Abstract

We study sample covariance matrices of the form $W = \frac{1}{n} C C^T$, where $C$ is a $k \times n$ matrix
with i.i.d. mean zero entries. This is a generalization of so-called Wishart matrices, where the 
entries of C are independent and identically distributed standard normal random variables. Such 
matrices arise in statistics as sample covariance matrices, and the high-dimensional case, when 
k is large, arises in the analysis of DNA experiments. 

We investigate the large deviation properties of the largest and smallest eigenvalues of W 
when either $k$ is fixed and $n \to \infty$, or $k_n \to \infty$ with $k_n = o(n/\log\log n)$, in the case where the
squares of the i.i.d. entries have finite exponential moments. Previous results, proving a.s. limits 
of the eigenvalues, only require finite fourth moments. 

Our most explicit results for k large are for the case where the entries of C are ±1 with equal 
probability. We relate the large deviation rate functions of the smallest and largest eigenvalue to 
the rate functions for independent and identically distributed standard normal entries of C. This 
case is of particular interest, since it is related to the problem of the decoding of a signal in a 
code division multiple access system arising in mobile communication systems. In this example, 
k plays the role of the number of users in the system, and n is the length of the coding sequence 
of each of the users. Each user transmits at the same time and uses the same frequency, and the 
codes are used to distinguish the signals of the separate users. The results imply large deviation 
bounds for the probability of a bit error due to the interference of the various users. 

Key words: Sample covariance matrices, large deviations, eigenvalues, CDMA with soft-decision 
parallel interference cancelation. 

1 Introduction 

The sample covariance matrix $W$ of a matrix $C$ with $k$ rows and $n$ columns is defined as $\frac{1}{n} C C^T$. If $C$ has random entries, then the spectrum of $W$ is random as well. Typically, $W$ is studied in the case that $C$ has i.i.d. entries, with mean 0 and variance 1. For this kind of $C$, it is known that when $k, n \to \infty$ such that $k/n \to \beta$, where $\beta$ is a constant, the eigenvalue density tends to a deterministic density [19]. The boundaries of the support of this distribution are $(1 - \sqrt{\beta})_+^2$ and $(1 + \sqrt{\beta})^2$, where $x_+ = \max\{0, x\}$. This suggests that the smallest eigenvalue $\lambda_{\min}$ converges to $(1 - \sqrt{\beta})_+^2$, while the largest eigenvalue $\lambda_{\max}$ converges to $(1 + \sqrt{\beta})^2$. Bai and Yin [?] have proved a.s. convergence of $\lambda_{\min}$ to $(1 - \sqrt{\beta})_+^2$. Bai, Silverstein and Yin [3] proved a.s. convergence of $\lambda_{\max}$ to $(1 + \sqrt{\beta})^2$; see also [23]. The strongest results apply in the case that all entries of $C$ are i.i.d. with mean 0, variance 1 and finite fourth moment. Related results, including a central limit theorem for the linear spectral statistics, can be found in [1, 2], to which we also refer for an overview of the extensive literature.

In the special case that the entries of C have a standard normal distribution, W is called a Wishart 
matrix. Wishart matrices play an important role in multivariate statistics as they describe the 
correlation structure in i.i.d. Gaussian multivariate data. For Wishart matrices, the large deviation 
rate function for the eigenvalue density has been derived by Guionnet [5] and Hiai and
Petz [11]. However, the proofs depend heavily on the fact that $C$ has standard normal i.i.d. entries,
for which the density of the ordered eigenvalues can be explicitly computed. 

In this article, we investigate the large deviation rate functions with rate $\frac{1}{n}$ of the smallest and largest eigenvalue of $W$, for certain non-Gaussian entries of $C$. We pose a strong condition on the tails of the entries, by requiring that the exponential moment of the square of the entries is bounded in a neighborhood of the origin. We shall also comment on this assumption, which we believe to be necessary for our results to apply.

We let $n \to \infty$, and $k$ is either fixed or tends to infinity not faster than $o(n/\log\log n)$. Our results
imply that all eigenvalues tend to 1 and that all other values are large deviations. We obtain the
asymptotic large deviation rate function of $\lambda_{\min}$ and $\lambda_{\max}$ when $k \to \infty$. In certain special cases, we
can show that the asymptotic large deviation rate function is equal to the one for Wishart matrices, 
which can be interpreted as saying that the spectrum of sample covariance matrices with k and n 
large is close to the one for i.i.d. standard normal entries. This proves a kind of universality result 
for the large deviation rate functions. 

This paper is organized as follows. In Section 2, we derive an explicit expression for the large deviation rate functions of $\lambda_{\min}$ and $\lambda_{\max}$. In Section 3, we calculate lower bounds for the case that the distribution of $C_{mi}$ is symmetric around 0, and $|C_{mi}| < M$ almost surely, for some $M > 0$. In Section 4, we specialize to the case where $C_{mi} = \pm 1$ with equal probability, which arises in an application in wireless communication. We describe the implications of our results in this application in Section 5. Part of the results for this application have been presented at an electrical engineering conference [?].

2 General mean zero entries of C 

In this section, we prove large deviation results for the smallest and largest eigenvalues of sample 
covariance matrices. 

2.1 Large deviations for $\lambda_{\min}$ and $\lambda_{\max}$

Define $W = \frac{1}{n} C C^T$ to be the matrix of sample covariances. We denote by $\mathbb{P}$ the law of $C$ and by $\mathbb{E}$ the corresponding expectation. Throughout the paper, we assume that the i.i.d. real matrix elements of $C$ are normalized, i.e.,

\[ \mathbb{E}[C_{ij}] = 0, \qquad \mathrm{Var}(C_{ij}) = 1. \tag{1} \]

The former implies that a.s., the off-diagonal elements of the matrix $W$ converge to zero, the second implies that the diagonal elements converge to 1, a.s. By a rescaling argument, the second assumption is without loss of generality.

In this section, we rewrite the probability for a large deviation of the largest and smallest eigenvalues of $W$, $\lambda_{\max}$ and $\lambda_{\min}$, respectively, into that of a large deviation of a sum of i.i.d. random variables. This rewrite allows us to use Cramér's Theorem to obtain an expression for the rate function. This section gives a heuristic derivation of our result, which will be turned into a proof in Section 2.2.

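For any symmetric matrix $W$, and any vector $x$ with $k$ coordinates and norm $\|x\|_2 = 1$, we have

\[ \lambda_{\min} \le \langle x, Wx \rangle \le \lambda_{\max}. \]

Moreover, for the normalized eigenvector $x_{\min}$ corresponding to $\lambda_{\min}$, the lower bound is attained, while for the normalized $x_{\max}$, corresponding to $\lambda_{\max}$, the upper bound is attained. Therefore, we can write

\[
\begin{aligned}
P_{\min}(\alpha) &= \mathbb{P}(\lambda_{\min} \le \alpha) = \mathbb{P}(\exists x : \|x\|_2 = 1, \langle x, Wx \rangle \le \alpha), \\
P_{\max}(\alpha) &= \mathbb{P}(\lambda_{\max} \ge \alpha) = \mathbb{P}(\exists x : \|x\|_2 = 1, \langle x, Wx \rangle \ge \alpha).
\end{aligned}
\tag{2}
\]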

We use that the above is the probability of a union of events, and bound this probability from 
below by considering only one x, and from above by summing over all x. Since there are uncountably 
many possible x, we will do this approximately by summing over a finite number of vectors. The 
lower bound for the probability yields an upper bound for the rate function, and vice versa. 
We first heuristically explain the form of the rate function of $\lambda_{\max}$ and $\lambda_{\min}$, and highlight the proof. The special form of a sample covariance matrix allows us to rewrite

\[ \langle x, Wx \rangle = \frac{1}{n} \|C^T x\|_2^2 = \frac{1}{n} \sum_{i=1}^{n} \Big( \sum_{m=1}^{k} x_m C_{mi} \Big)^2 = \frac{1}{n} \sum_{i=1}^{n} S_{x,i}^2, \tag{3} \]

where

\[ S_{x,i} = \sum_{m=1}^{k} x_m C_{mi}, \tag{4} \]

with $S_{x,i}$ i.i.d. for $i = 1, \ldots, n$. Define

\[ I_k(\alpha) = \inf_{x \in \mathbb{R}^k : \|x\|_2 = 1} \, \sup_{t} \Big( t\alpha - \log \mathbb{E}\big[e^{t S_{x,1}^2}\big] \Big). \tag{5} \]

Since $\mathbb{E}[S_{x,1}^2] = 1$, and $t \mapsto \log \mathbb{E}[e^{t S_{x,1}^2}]$ is increasing and convex, we see that, for fixed $x$, the optimal $t$ is non-negative for $\alpha \ge 1$ and non-positive for $\alpha \le 1$. The sign of $t$ will play an important role in the proofs in the sections that follow.
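
As a small numerical illustration (not part of the original argument), the identity $\langle x, Wx \rangle = \frac{1}{n}\sum_i S_{x,i}^2$ can be checked directly. The sketch below assumes $\pm 1$ entries and uses hypothetical helper names (`quadratic_form`, `mean_square_projection`, `random_sign_matrix`) chosen only for this example.

```python
import random


def quadratic_form(C, x):
    """Return <x, W x> with W = (1/n) C C^T, built explicitly as a k x k matrix."""
    k, n = len(C), len(C[0])
    W = [[sum(C[a][i] * C[b][i] for i in range(n)) / n for b in range(k)]
         for a in range(k)]
    return sum(x[a] * W[a][b] * x[b] for a in range(k) for b in range(k))


def mean_square_projection(C, x):
    """Return (1/n) * sum_i S_{x,i}^2, where S_{x,i} = sum_m x_m C_{mi}."""
    k, n = len(C), len(C[0])
    return sum(sum(x[m] * C[m][i] for m in range(k)) ** 2 for i in range(n)) / n


def random_sign_matrix(k, n, rng):
    """k x n matrix with i.i.d. entries equal to +1 or -1 with equal probability."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(k)]


rng = random.Random(0)          # fixed seed so the example is reproducible
C = random_sign_matrix(3, 50, rng)
x = [0.6, 0.0, 0.8]             # a unit vector in R^3
lhs = quadratic_form(C, x)
rhs = mean_square_projection(C, x)
```

Both routes compute the same number up to rounding, which is what allows the eigenvalue problem to be rewritten in terms of sums of the i.i.d. variables $S_{x,i}^2$.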

We can now state the first result of this paper. 

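Theorem 2.1 Assume that (1) holds. Then,

(a) for all $\alpha \ge 1$ and fixed $k \ge 2$,

\[ \limsup_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha) \le I_k(\alpha), \tag{6} \]

and

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha) \ge \lim_{\varepsilon \downarrow 0} I_k(\alpha - \varepsilon), \tag{7} \]

(b) for all $0 \le \alpha \le 1$ and fixed $k \ge 2$,

\[ \limsup_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) \le I_k(\alpha), \tag{8} \]

and

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) \ge \lim_{\varepsilon \downarrow 0} I_k(\alpha + \varepsilon). \tag{9} \]

When there exists an $\varepsilon > 0$ such that $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$ and when $\mathrm{Var}(C_{11}^2) > 0$, then $I_k(\alpha) > 0$ for all $\alpha \neq 1$.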

We will now discuss the main result in Theorem 2.1. Theorem 2.1 is only useful when $I_k(\alpha) > 0$, which we prove under the strong condition that there exists an $\varepsilon > 0$ such that $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$. For comparison, a.s. limits for the largest and smallest eigenvalues are proved under the much weaker condition that the fourth moment of the matrix entries $C_{im}$ is finite. However, it is well known that the exponential bounds present in large deviations are only valid when the random variables under consideration have finite exponential moments (see e.g., Theorem 2.2 below). Without such moments, the rate functions can be equal to zero, and the large deviation results are rather uninformative. Since the eigenvalues are quadratic in the entries $\{C_{im}\}_{i,m}$, this translates into the above condition, which we therefore believe to be necessary.

Secondly, we note that, due to the occurrence of an infimum over $x$ and a supremum over $t$, it is unclear whether the function $\alpha \mapsto I_k(\alpha)$ is continuous. Clearly, when $\alpha \mapsto I_k(\alpha)$ is continuous, the upper and lower bounds in (6) and (7), as well as the ones in (8) and (9), are equal. We will see that this is the case for Wishart matrices in Section 2.3. The function $\alpha \mapsto I_k(\alpha)$ can easily be seen to be increasing on $[1, \infty)$ and decreasing on $(0, 1]$, since $\alpha \mapsto \sup_t \big( t\alpha - \log \mathbb{E}[e^{t S_{x,1}^2}] \big)$ has the same monotonicity properties for every fixed $x$, so that the limits $\lim_{\varepsilon \downarrow 0} I_k(\alpha + \varepsilon)$ and $\lim_{\varepsilon \downarrow 0} I_k(\alpha - \varepsilon)$ exist as monotone limits. The continuity of $\alpha \mapsto I_k(\alpha)$ is not obvious. For example, in the simplest case where $C_{ij} = \pm 1$ with equal probability, we know that the large deviation rate function is not continuous, since the largest eigenvalue is at most $k$. Therefore, $\mathbb{P}(\lambda_{\max} \ge \alpha) = 0$ for any $\alpha > k$, and, if $\alpha \mapsto I_k(\alpha)$ is the rate function of $\lambda_{\max}$ for $\alpha \ge 1$, then $I_k(\alpha) = \infty$ for $\alpha > k$. It remains an interesting problem to determine in what cases $\alpha \mapsto I_k(\alpha)$ is continuous.

Finally, we only prove that $I_k(\alpha) > 0$ for all $\alpha \neq 1$ when $\mathrm{Var}(C_{11}^2) > 0$. By the normalization that $\mathbb{E}[C_{11}] = 0$, $\mathbb{E}[C_{11}^2] = 1$, this only excludes the case where $C_{11} = \pm 1$ with equal probability. This case will be investigated in more detail in Theorem 4.1, where we shall also prove a lower bound implying that $I_k(\alpha) > 0$ for all $\alpha \neq 1$.

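Denote

\[ I_k(\alpha, \beta) = \inf_{\substack{x, y \in \mathbb{R}^k : \|x\|_2 = \|y\|_2 = 1 \\ \langle x, y \rangle = 0}} \, \sup_{s, t} \Big( t\alpha + s\beta - \log \mathbb{E}\big[e^{t S_{x,1}^2 + s S_{y,1}^2}\big] \Big). \tag{10} \]

Our proof also reveals that, for all $0 \le \beta \le 1$ and $\alpha \ge 1$,

\[ \limsup_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha, \lambda_{\min} \le \beta) \le I_k(\alpha, \beta), \tag{11} \]

and

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha, \lambda_{\min} \le \beta) \ge \lim_{\varepsilon \downarrow 0} I_k(\alpha - \varepsilon, \beta + \varepsilon). \tag{12} \]

For Wishart matrices, for which the entries of $C$ are i.i.d. standard normal, the random variable $S_{x,i}$ has a standard normal distribution, so that we can explicitly calculate $I_k(\alpha)$. We will elaborate on this in Section 2.3 below. For the case that $C_{mi} = \pm 1$ with equal probabilities, Theorem 2.1 and its proof have also appeared in [7].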



2.2 Proof of Theorem 2.1(a) and (b)

In the proof, we will repeatedly make use of the largest-exponent-wins principle. We first give a 
short explanation of this principle. This principle is about the exponential rate of the sum of two (or 
more) probabilities. From this point, we will abbreviate 'exponential rate of a probability' by 'rate'. 
Because of the minus sign, a smaller rate $I$ means a larger exponent, and thus a larger probability. Thus, if for two events $E_1$ and $E_2$, both depending on some parameter $n$, we have

\[ \mathbb{P}(E_1) \sim e^{-n I_1} \quad \text{and} \quad \mathbb{P}(E_2) \sim e^{-n I_2}, \]

then

\[ -\lim_{n \to \infty} \frac{1}{n} \log \big( \mathbb{P}(E_1) + \mathbb{P}(E_2) \big) = \min\{I_1, I_2\}. \tag{13} \]

In words, the principle states that as $n \to \infty$, the smallest exponent (i.e., the largest rate) will become negligible. It also implies that

\[ -\lim_{n \to \infty} \frac{1}{n} \log \mathbb{P}(E_1 \cup E_2) = \min\{I_1, I_2\}. \tag{14} \]
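
The principle in (13) can be made concrete with a short numerical sketch. The function name `combined_rate` is introduced here only for illustration, and the two probabilities are assumed to be exactly $e^{-nI_1}$ and $e^{-nI_2}$, which is an idealization of the asymptotic statement.

```python
import math


def combined_rate(n, rate1, rate2):
    """Return -(1/n) * log(exp(-n*rate1) + exp(-n*rate2)).

    The smaller rate is factored out first (log-sum-exp), so that the
    computation stays accurate even when exp(-n*rate) underflows.
    """
    smallest = min(rate1, rate2)
    remainder = (math.exp(-n * (rate1 - smallest))
                 + math.exp(-n * (rate2 - smallest)))
    return smallest - math.log(remainder) / n


# The combined rate approaches min(rate1, rate2) as n grows.
rates_for_growing_n = [combined_rate(n, 0.3, 0.7) for n in (1, 10, 100, 1000)]
```

Since the bracketed sum lies between 1 and 2, the combined rate differs from $\min\{I_1, I_2\}$ by at most $\log 2 / n$, which is the quantitative content of the principle for two events.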

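In the proof, we will make essential use of Cramér's Theorem, which we state here for the sake of completeness:

Theorem 2.2 (Cramér's theorem and Chernoff bound) Let $(X_i)_{i=1}^{\infty}$ be a sequence of i.i.d. random variables. Then, for all $a \ge \mathbb{E}[X_1]$,

\[ -\lim_{n \to \infty} \frac{1}{n} \log \mathbb{P}\Big( \frac{1}{n} \sum_{i=1}^{n} X_i \ge a \Big) = \sup_{t \ge 0} \big( ta - \log \mathbb{E}[e^{t X_1}] \big), \tag{15} \]

while, for all $a \le \mathbb{E}[X_1]$,

\[ -\lim_{n \to \infty} \frac{1}{n} \log \mathbb{P}\Big( \frac{1}{n} \sum_{i=1}^{n} X_i \le a \Big) = \sup_{t \le 0} \big( ta - \log \mathbb{E}[e^{t X_1}] \big). \tag{16} \]

The upper bounds in (15)-(16) hold for every $n$.

Furthermore, when $\mathbb{E}[e^{t X_1}] < \infty$ for all $t$ with $|t| \le \varepsilon$ and some $\varepsilon > 0$, then the right-hand sides of (15) and (16) are strictly positive for all $a \neq \mathbb{E}[X_1]$.

See e.g., [20, Theorem 1.1, pages 5-6 and Proposition 1.9, page 13] for this result, and see [?] and [12] for general introductions to large deviation theory.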

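For the proof, we start by showing that $I_k(\alpha) > 0$ for all $\alpha \neq 1$ when there exists an $\varepsilon > 0$ such that $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$ and when $\mathrm{Var}(C_{11}^2) > 0$. For this, we note that, by the Cauchy-Schwarz inequality and (4), for every $x$ with $\|x\|_2 = 1$,

\[ S_{x,i}^2 \le \sum_{m=1}^{k} x_m^2 \sum_{m=1}^{k} C_{mi}^2 = \sum_{m=1}^{k} C_{mi}^2, \]

so that $\mathbb{E}[e^{t S_{x,i}^2}] \le \mathbb{E}[e^{t C_{11}^2}]^k < \infty$ for $t \le \varepsilon$ whenever there exists an $\varepsilon > 0$ such that $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$. Thus, uniformly in $x$ with $\|x\|_2 = 1$, the random variables $S_{x,i}^2$ have bounded exponential moments for $t \le \varepsilon$. As a result, the Taylor expansion

\[ \log \mathbb{E}[e^{t S_{x,i}^2}] = t + \frac{t^2}{2} \mathrm{Var}(S_{x,i}^2) + O(|t|^3) \tag{17} \]

holds uniformly in $x$ with $\|x\|_2 = 1$. We compute, since $\mathbb{E}[S_{x,i}^2] = \mathbb{E}[C_{11}^2] = 1$, and for $x$ with $\|x\|_2 = 1$,

\[ \mathbb{E}[S_{x,i}^4] = 3 \Big( \sum_m x_m^2 \Big)^2 - 3 \sum_m x_m^4 + \mathbb{E}[C_{11}^4] \sum_m x_m^4 = 3 - 3 \sum_m x_m^4 + \mathbb{E}[C_{11}^4] \sum_m x_m^4, \]

that

\[ \mathrm{Var}(S_{x,i}^2) = 3 - 3 \sum_m x_m^4 + \mathbb{E}[C_{11}^4] \sum_m x_m^4 - 1 = 2 - 2 \sum_m x_m^4 + \mathrm{Var}(C_{11}^2) \sum_m x_m^4, \]

which is bounded, since by assumption $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$. Furthermore, $\sum_m x_m^4 \in [0, 1]$ uniformly in $x$ with $\|x\|_2 = 1$, so that, again uniformly in $x$ with $\|x\|_2 = 1$, $\mathrm{Var}(S_{x,i}^2) \ge \min\{2, \mathrm{Var}(C_{11}^2)\} > 0$. We conclude that, for $t$ sufficiently small, uniformly in $x$ with $\|x\|_2 = 1$, and by ignoring higher-order Taylor expansion terms of $t \mapsto \log \mathbb{E}[e^{t S_{x,i}^2}]$ in (17), which is allowed when $|t|$ is sufficiently small,

\[ \log \mathbb{E}[e^{t S_{x,i}^2}] \le t + t^2 \min\{2, \mathrm{Var}(C_{11}^2)\}. \]

In turn, this implies that for $|t| \le \varepsilon$ small, and uniformly in $x$ with $\|x\|_2 = 1$,

\[ I_k(\alpha) \ge \inf_{x \in \mathbb{R}^k : \|x\|_2 = 1} \, \sup_{|t| \le \varepsilon} \big( t\alpha - \log \mathbb{E}[e^{t S_{x,1}^2}] \big) \ge \inf_{x \in \mathbb{R}^k : \|x\|_2 = 1} \, \sup_{|t| \le \varepsilon} \big( t(\alpha - 1) - t^2 \min\{2, \mathrm{Var}(C_{11}^2)\} \big) > 0, \]

the latter bound holding for every $\alpha \neq 1$ when $\mathrm{Var}(C_{11}^2) > 0$. This completes the proof that $I_k(\alpha) > 0$ for all $\alpha \neq 1$ when there exists an $\varepsilon > 0$ such that $\mathbb{E}[e^{\varepsilon C_{11}^2}] < \infty$ and $\mathrm{Var}(C_{11}^2) > 0$.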
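
The variance identity $\mathrm{Var}(S_{x,i}^2) = 2 - 2\sum_m x_m^4 + \mathrm{Var}(C_{11}^2)\sum_m x_m^4$ can be verified exactly for a small example. The sketch below assumes, purely for illustration, entries that are uniform on the three points $\{-\sqrt{3/2}, 0, \sqrt{3/2}\}$ (mean 0, variance 1); the distribution and all function names are choices made for this example and do not come from the paper.

```python
import itertools
import math

# Illustrative entry distribution: each value has probability 1/3,
# giving mean 0 and second moment 1 as required by the normalization (1).
VALUES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))


def entry_moment(p):
    """p-th moment of a single entry C_11 under the illustrative distribution."""
    return sum(v ** p for v in VALUES) / len(VALUES)


def variance_of_square_projection(x):
    """Exact Var(S_x^2) for S_x = sum_m x_m C_m, by enumerating all 3^k outcomes."""
    second = fourth = 0.0
    count = 0
    for outcome in itertools.product(VALUES, repeat=len(x)):
        s = sum(xm * c for xm, c in zip(x, outcome))
        second += s ** 2
        fourth += s ** 4
        count += 1
    second /= count
    fourth /= count
    return fourth - second ** 2


def variance_formula(x):
    """Closed form 2 - 2*sum(x^4) + Var(C^2)*sum(x^4), valid for ||x||_2 = 1."""
    sum_x4 = sum(xm ** 4 for xm in x)
    var_c2 = entry_moment(4) - entry_moment(2) ** 2
    return 2.0 - 2.0 * sum_x4 + var_c2 * sum_x4


x = [0.6, 0.8, 0.0]   # a unit vector
exact = variance_of_square_projection(x)
closed_form = variance_formula(x)
```

The enumeration and the closed form agree up to rounding, and both lie between $\min\{2, \mathrm{Var}(C_{11}^2)\}$ and $\max\{2, \mathrm{Var}(C_{11}^2)\}$, since the formula is a convex combination of these two values.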

We continue by proving (6)-(9). The proof for $\lambda_{\max}$ is similar to the one for $\lambda_{\min}$, so we will focus on the latter. To obtain the upper bound for the rate of $P_{\min}(\alpha)$ in (2), we use that for any $x'$ with $\|x'\|_2 = 1$,

\[ \mathbb{P}(\lambda_{\min} \le \alpha) = \mathbb{P}(\exists x : \langle x, Wx \rangle \le \alpha) \ge \mathbb{P}(\langle x', Wx' \rangle \le \alpha). \tag{18} \]

Now insert (3). Since $x'$ is fixed, the $S_{x',i}^2$ are i.i.d. variables, and we can apply Cramér's Theorem to obtain the upper bound for the rate function for fixed $x'$. This yields that, for every $x'$, we have

\[ -\liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) \le \sup_t \big( t\alpha - \log \mathbb{E}[e^{t S_{x',1}^2}] \big). \tag{19} \]

If we minimize the right-hand side over $x'$, then we arrive at $I_k(\alpha)$ as the upper bound, and we have proved (8). The proof for (6) is identical.

We are left to prove the lower bounds in (7) and (9). For this, we wish to sum over all possible $x$. We approximate the sphere $\|x\|_2 = 1$ by a finite set of vectors $x^{(j)}$ with $\|x^{(j)}\|_2 = 1$, such that every $x$ on the sphere is within distance $d$ of at least one of these vectors, and observe that

\[
\begin{aligned}
|\langle x, Wx \rangle - \langle x^{(j)}, W x^{(j)} \rangle| &= |\langle x - x^{(j)}, Wx \rangle + \langle x^{(j)}, W(x - x^{(j)}) \rangle| \\
&= |\langle x, W(x - x^{(j)}) \rangle + \langle x^{(j)}, W(x - x^{(j)}) \rangle| \\
&\le (\|x\| + \|x^{(j)}\|) \cdot \|W\| \cdot \|x - x^{(j)}\| \le 2 \lambda_{\max} d.
\end{aligned}
\]
We need that $\lambda_{\max} \le \kappa k$, with $\kappa$ some large enough constant, with sufficiently high probability, which we will prove first. We have that $\lambda_{\max} \le T_W$, where $T_W$ is the trace of $W$, since $W$ is non-negative. Note that

\[ T_W = \frac{1}{n} \sum_{i=1}^{n} \sum_{m=1}^{k} C_{mi}^2. \tag{20} \]

Thus, $T_W$ is $\frac{1}{n}$ times a sum of $nk$ i.i.d. variables. Since $\mathbb{E}[e^{t C_{11}^2}] < \infty$ for all $t \le \varepsilon$, we can use Cramér's Theorem for $T_W$. Therefore, for any $\kappa$, by the Chernoff bound,

\[ \mathbb{P}(T_W > \kappa k) \le e^{-nk I_{C^2}(\kappa)}, \tag{21} \]

where

\[ I_{C^2}(a) = \sup_t \big( ta - \log \mathbb{E}[e^{t C_{11}^2}] \big). \tag{22} \]

Since $\mathbb{E}[C_{11}^2] = \mathrm{Var}(C_{11}) = 1$, we have that $I_{C^2}(\kappa) > 0$ for any $\kappa > 1$. Therefore, by picking $\kappa > 1$ large enough, we can make $k I_{C^2}(\kappa)$ arbitrarily large. If we take $k I_{C^2}(\kappa)$ larger than $I_k(\alpha - \varepsilon)$, then, according to the largest-exponent-wins principle, this will not influence the result. (Note that when $I_k(\alpha - \varepsilon) = \infty$ for all $\varepsilon > 0$, then we can also let $k I_{C^2}(\kappa)$ tend to infinity by taking $\kappa \to \infty$.)
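
The two deterministic ingredients used so far, the perturbation bound $|\langle x, Wx \rangle - \langle y, Wy \rangle| \le 2\lambda_{\max} \|x - y\|_2$ and the trace bound $\lambda_{\max} \le T_W$, combine into an inequality that can be tested without computing eigenvalues. The following sketch assumes $\pm 1$ entries and uses helper names invented for this illustration.

```python
import math
import random


def sample_covariance(C):
    """Return W = (1/n) C C^T for a k x n matrix C given as a list of rows."""
    k, n = len(C), len(C[0])
    return [[sum(C[a][i] * C[b][i] for i in range(n)) / n for b in range(k)]
            for a in range(k)]


def quad(W, x):
    """Quadratic form <x, W x>."""
    k = len(W)
    return sum(x[a] * W[a][b] * x[b] for a in range(k) for b in range(k))


def unit_vector(k, rng):
    """Random point on the unit sphere in R^k (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def perturbation_gap(W, x, y):
    """Return (|<x,Wx> - <y,Wy>|, 2 * trace(W) * ||x - y||_2)."""
    trace = sum(W[a][a] for a in range(len(W)))
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return abs(quad(W, x) - quad(W, y)), 2.0 * trace * dist


rng = random.Random(1)   # fixed seed for reproducibility
C = [[rng.choice((-1.0, 1.0)) for _ in range(40)] for _ in range(4)]
W = sample_covariance(C)
gaps = [perturbation_gap(W, unit_vector(4, rng), unit_vector(4, rng))
        for _ in range(100)]
```

In every sampled pair the first component of the gap is at most the second, which is the deterministic fact that lets a finite net of vectors stand in for the whole sphere.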
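It follows that

\[
\begin{aligned}
\mathbb{P}(\lambda_{\min} \le \alpha) &\le \mathbb{P}(\exists x^{(j)} : \langle x^{(j)}, W x^{(j)} \rangle \le \alpha + 2d\kappa k) + \mathbb{P}(T_W > \kappa k) \\
&\le \sum_j \mathbb{P}(\langle x^{(j)}, W x^{(j)} \rangle \le \alpha + 2d\kappa k) + \mathbb{P}(T_W > \kappa k) \\
&\le N_d \sup_{x^{(j)}} \mathbb{P}(\langle x^{(j)}, W x^{(j)} \rangle \le \alpha + 2d\kappa k) + \mathbb{P}(T_W > \kappa k),
\end{aligned}
\tag{23}
\]

with $N_d$ the number of vectors in the finite approximation of the sphere. The above bound is valid for every choice of $k$, $\kappa$, $\alpha$ and $d$.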

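We write $\varepsilon = 2d\kappa k$ and will later let $\varepsilon \downarrow 0$. Then, applying the largest-exponent-wins principle for $\kappa > 0$ large enough, as well as Cramér's Theorem together with (3), we arrive at

\[
\begin{aligned}
-\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) &\ge \inf_{x^{(j)}} \sup_t \Big( t(\alpha + \varepsilon) - \log \mathbb{E}\big[e^{t S_{x^{(j)},1}^2}\big] \Big) + \liminf_{n \to \infty} -\frac{1}{n} \log N_d \\
&\ge I_k(\alpha + \varepsilon) + \liminf_{n \to \infty} -\frac{1}{n} \log N_d.
\end{aligned}
\tag{24}
\]

In a similar way, we obtain that

\[
\begin{aligned}
-\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha) &\ge \inf_{x^{(j)}} \sup_t \Big( t(\alpha - \varepsilon) - \log \mathbb{E}\big[e^{t S_{x^{(j)},1}^2}\big] \Big) + \liminf_{n \to \infty} -\frac{1}{n} \log N_d \\
&\ge I_k(\alpha - \varepsilon) + \liminf_{n \to \infty} -\frac{1}{n} \log N_d,
\end{aligned}
\tag{25}
\]

where we take $d$ so small that $\alpha - \varepsilon > 0$.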

A simple overestimation of $N_d$ is obtained by first taking the cube $[-1, 1]^k \subset \mathbb{R}^k$ around the origin, and laying a grid on this cube. We then normalize the centers of the grid cubes to have norm 1. The finite set of vectors consists of the normalized centers of the small cubes of width $2/L$. In this case,

\[ d \le \frac{3\sqrt{k}}{L}, \qquad \text{and} \qquad N_d \le L^k. \tag{26} \]

Indeed, the first bound follows since, for any vector $x$, there exists a center of a small cube for which all coordinates are at most $1/L$ away. Therefore, the distance of $x$ to this center is at most $\sqrt{k}/L$. Since $x$ has norm 1, the norm of the center of the cube is in between $1 - \sqrt{k}/L$ and $1 + \sqrt{k}/L$, and we obtain that the distance of $x$ to the normalized center of the small cube is at most $3\sqrt{k}/L$ when $\sqrt{k}/L \le 1/2$. For this choice, we have $\varepsilon = 6\kappa k^{3/2}/L$, which we can make small by taking $L$ large. We conclude that, for any $L < \infty$, $\lim_{n \to \infty} \frac{1}{n} \log N_d = 0$, so that, for any $\kappa > 1$ sufficiently large,

\[ -\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) \ge I_k(\alpha + \varepsilon), \tag{28} \]

and

\[ -\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha) \ge I_k(\alpha - \varepsilon), \tag{29} \]

when the respective right-hand sides are finite. Since the above statement is true for any $\varepsilon$, we can take $\varepsilon \downarrow 0$ by letting $L \uparrow \infty$. When the right-hand sides are infinite, then we conclude that also the left-hand sides can be made arbitrarily large by letting $L \uparrow \infty$. This completes the proof of (7) and (9). □



To see (11)-(12), we follow the above proof. We first note that the eigenvectors corresponding to $\lambda_{\max}$ and $\lambda_{\min}$ are orthogonal. Therefore, we obtain that

\[ \mathbb{P}(\lambda_{\max} \ge \alpha, \lambda_{\min} \le \beta) = \mathbb{P}(\exists x, y : \|x\|_2 = \|y\|_2 = 1, \langle x, y \rangle = 0, \langle x, Wx \rangle \ge \alpha, \langle y, Wy \rangle \le \beta). \tag{30} \]

We now proceed as above, and for the lower bound pick any $x, y$ satisfying the requirements in the probability on the right-hand side. The upper bound is slightly harder. For this, we need to pick a finite approximation for the choices of $x$ and $y$ such that $\|x\|_2 = \|y\|_2 = 1$ and $\langle x, y \rangle = 0$. We will now show that we can do this in such a way that the total number of pairs $\{x^{(i)}, y^{(i,j)}\}_{i,j \ge 1}$ is bounded by $N_d^2$, where $N_d$ is as in (26).

We pick $\{x^{(i)}\}_{i \ge 1}$ as in the above proof. Then, for fixed $x^{(i)}$, we define a finite number of $y$ such that $\langle x^{(i)}, y \rangle = 0$. For this, we consider, for fixed $x^{(i)}$, only those cubes of width $2/L$ around an $x^{(j)}$, for some $j$, that contain at least one element $z$ having norm 1 and such that $\langle z, x^{(i)} \rangle = 0$. Fix one of such cubes. If there are more such $z$ in this cube around $x^{(j)}$, then we pick the unique element that is closest to $x^{(j)}$. We denote this element by $y^{(i,j)}$. The set of these elements $y^{(i,j)}$ will be denoted by $\{y^{(i,j)}\}_{j \ge 1}$. The finite subset of the set $\|x\|_2 = \|y\|_2 = 1$ and $\langle x, y \rangle = 0$ then consists of $\{x^{(i)}, y^{(i,j)}\}_{i,j \ge 1}$.

We clearly have that every $x$ and $y$ with $\|x\|_2 = \|y\|_2 = 1$ and $\langle x, y \rangle = 0$ can be approximated by a pair $x^{(i)}$ and $y^{(i,j)}$ such that $\|x - x^{(i)}\|_2 \le d$ and $\|y - y^{(i,j)}\|_2 \le 2d$. Then we can complete the proof as above.



2.3 Special case: Wishart matrices 

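To give an example, we go to Wishart matrices, for which $C_{ij}$ are i.i.d. standard normal. In this case, we can compute $I_k(\alpha)$ and $I_k(\alpha, \beta)$ explicitly. To compute $I_k(\alpha)$, we note that, for any $x$ such that $\|x\|_2 = 1$, we have that $S_{x,i}$ is standard normal. Therefore,

\[ \mathbb{E}[e^{t S_{x,i}^2}] = \frac{1}{\sqrt{1 - 2t}}, \tag{31} \]

so that

\[ I_k(\alpha) = \sup_t \Big( t\alpha - \log \Big( \frac{1}{\sqrt{1 - 2t}} \Big) \Big). \tag{32} \]

In order to compute $I_k(\alpha)$, we note that the maximization problem over $t$ in $\sup_t \big( t\alpha - \log \big( \frac{1}{\sqrt{1 - 2t}} \big) \big)$ is straightforward, and yields $t^* = \frac{1}{2} - \frac{1}{2\alpha}$ and $I_k(\alpha) = \frac{1}{2}(\alpha - 1 - \log \alpha)$. Note that $I_k(\alpha)$ is independent of $k$. In particular, we see that $\alpha \mapsto I_k(\alpha)$ is continuous, which leads us to the following corollary:

Corollary 2.3 Let $C_{ij}$ be independent standard normals. Then,

(a) for all $\alpha \ge 1$ and fixed $k \ge 2$,

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha) = \frac{1}{2}(\alpha - 1 - \log \alpha), \tag{33} \]

(b) for all $0 \le \alpha \le 1$ and fixed $k \ge 2$,

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha) = \frac{1}{2}(\alpha - 1 - \log \alpha). \tag{34} \]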
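
The closed form for the Wishart rate function can be cross-checked against a direct numerical maximization of $t\alpha + \frac{1}{2}\log(1 - 2t)$, which is the objective in (32). This is only a sketch: the function names and the search interval $[-50, \frac{1}{2})$ are assumptions of the example, and the interval restricts it to $\alpha \ge 1/101$.

```python
import math


def wishart_rate_closed_form(alpha):
    """Closed form (1/2) * (alpha - 1 - log(alpha)) of the Wishart rate function."""
    return 0.5 * (alpha - 1.0 - math.log(alpha))


def wishart_rate_numeric(alpha, steps=200):
    """Maximize t*alpha + 0.5*log(1 - 2t) over t in [-50, 1/2) by ternary search.

    The objective is strictly concave in t, so ternary search converges to
    the unique maximizer t* = 1/2 - 1/(2*alpha) when it lies in the interval.
    """
    def objective(t):
        return t * alpha + 0.5 * math.log(1.0 - 2.0 * t)

    lo, hi = -50.0, 0.5 - 1e-12
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


comparison = [(a, wishart_rate_numeric(a), wishart_rate_closed_form(a))
              for a in (0.25, 0.5, 1.0, 2.0, 4.0)]
```

The rate vanishes only at $\alpha = 1$ and is strictly positive on both sides, in line with the statement that values of the eigenvalues away from 1 are large deviations.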
We next turn to the computation of $I_k(\alpha, \beta)$. When $x$ and $y$ are such that $\|x\|_2 = \|y\|_2 = 1$ and $\langle x, y \rangle = 0$, then we have that $(S_{x,i}, S_{y,i})$ are jointly normally distributed. It can easily be seen that $\mathbb{E}[S_{x,i}] = 0$, $\mathbb{E}[S_{x,i}^2] = \|x\|_2^2 = 1$, so that $S_{x,i}$ and $S_{y,i}$ are standard normal. Moreover, $\mathbb{E}[S_{x,i} S_{y,i}] = \langle x, y \rangle = 0$, so that $(S_{x,i}, S_{y,i})$ are in fact independent standard normal random variables. Therefore,

\[ \mathbb{E}[e^{t S_{x,i}^2 + s S_{y,i}^2}] = \mathbb{E}[e^{t S_{x,i}^2}] \, \mathbb{E}[e^{s S_{y,i}^2}] = \frac{1}{\sqrt{1 - 2t}} \cdot \frac{1}{\sqrt{1 - 2s}}, \tag{35} \]

and, for $\alpha \ge 1$ and $\beta \in [0, 1]$,

\[ I_k(\alpha, \beta) = \sup_{s, t} \Big( t\alpha + s\beta - \log \frac{1}{\sqrt{1 - 2t}} - \log \frac{1}{\sqrt{1 - 2s}} \Big) = I_k(\alpha) + I_k(\beta), \tag{36} \]

so that the exponential rate of the probability that $\lambda_{\max} \ge \alpha$ and $\lambda_{\min} \le \beta$ is the exponential rate of the product of the probabilities that $\lambda_{\max} \ge \alpha$ and $\lambda_{\min} \le \beta$. This remarkable form of independence seems to be true only for Wishart matrices.

The above considerations lead to the following corollary:

Corollary 2.4 Let $C_{ij}$ be independent standard normals. Then,

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}(\lambda_{\max} \ge \alpha, \lambda_{\min} \le \beta) = \frac{1}{2}(\alpha - 1 - \log \alpha) + \frac{1}{2}(\beta - 1 - \log \beta). \tag{37} \]

In the sequel, we will, among other things, investigate cases where, for $k \to \infty$, the rate function $I_k(\alpha)$ for general entries of $C$ converges to the Gaussian limit $I_\infty(\alpha) = \frac{1}{2}(\alpha - 1 - \log \alpha)$.



3 Asymptotics for the eigenvalues for symmetric and bounded 
entries of C 

In this section, we investigate the case where $C_{mi}$ is symmetric around 0 and $|C_{mi}| < M < \infty$ almost surely, or $C_{mi}$ is standard normal. To emphasize the role of $k$, we will denote the law of $W$ for a given $k$ by $\mathbb{P}_k$. We define the extension to $k = \infty$ of $I_k(\alpha)$ to be

\[ I_\infty(\alpha) = \inf_{x \in \ell^2(\mathbb{N}) : \|x\|_2 = 1} \, \sup_t \Big( t\alpha - \log \mathbb{E}\big[e^{t S_{x,1}^2}\big] \Big), \tag{38} \]

where $\ell^2(\mathbb{N})$ is the space of all infinite square-summable sequences, with norm $\|x\|_2 = \sqrt{\sum_{i=1}^{\infty} x_i^2}$.
The main result in this section is the following theorem:

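Theorem 3.1 Suppose that $C_{mi}$ is symmetric around zero and that $|C_{mi}| < M < \infty$ almost surely, or $C_{mi}$ is standard normal. Then, for all $k_n \to \infty$ such that $k_n = o\big( \frac{n}{\log\log n} \big)$,

(a) for all $\alpha \ge 1$,

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\max} \ge \alpha) \le I_\infty(\alpha), \tag{39} \]

\[ \limsup_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\max} \ge \alpha) \ge \lim_{\varepsilon \downarrow 0} I_\infty(\alpha - \varepsilon), \tag{40} \]

(b) for all $0 \le \alpha \le 1$,

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\min} \le \alpha) \le I_\infty(\alpha), \tag{41} \]

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\min} \le \alpha) \ge \lim_{\varepsilon \downarrow 0} I_\infty(\alpha + \varepsilon). \tag{42} \]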

A version of this result has also been published in a conference proceeding [7], for the special case $C_{mi} = \pm 1$, each with probability 1/2, and under a restriction on the growth of $k_n$. Unfortunately, there is a technical error in the proof, and below we present the corrected proof. In order to do so, we will rely on explicit lower bounds for $I_k(\alpha)$ for $\alpha \ge 1$.

A priori, it is not obvious that the limit $I_\infty(\alpha)$ is strictly positive for $\alpha \neq 1$. However, in the examples we will investigate later on, such as $C_{mi} = \pm 1$ with equal probability, we will see that indeed $I_\infty(\alpha) > 0$ for $\alpha \neq 1$. Possibly, such a result can be shown more generally.

The following proposition is instrumental in the proof of Theorem 3.1:
Proposition 3.2 Assume that $C_{mi}$ is symmetric around zero and that $|C_{mi}| < M < \infty$ almost surely, or $C_{mi}$ is standard normal. Then, for all $k$, $\alpha \ge M^2$ and $x$ with $\|x\|_2 = 1$,

\[ \mathbb{P}_k(\langle x, Wx \rangle \ge \alpha) \le e^{-n J_k(\alpha)}, \tag{43} \]

where

\[ J_k(\alpha) = \frac{1}{2} \Big( \frac{\alpha}{M^2} - 1 - \log \frac{\alpha}{M^2} \Big). \tag{44} \]

In the case where $C_{mi} = \pm 1$, for which $M > 1$, we will present an improved version of this bound, valid when $\alpha \ge 1/2$, in Theorem 4.1 below.



3.1 Proof of Proposition 3.2

Throughout this proof, we fix $x$ with $\|x\|_2 = 1$. We use (3) to bound, for every $t \ge 0$ and $k \in \mathbb{N}$, by the Markov inequality,

\[ \mathbb{P}_k(\langle x, Wx \rangle \ge \alpha) = \mathbb{P}_k\Big( e^{t \sum_{i=1}^{n} S_{x,i}^2} \ge e^{nt\alpha} \Big) \le e^{-n \left( \alpha t - \log \mathbb{E}_k[e^{t S_{x,1}^2}] \right)}. \tag{45} \]

We claim that for all $0 \le t \le \frac{1}{2M^2}$,

\[ \mathbb{E}_k[e^{t S_{x,1}^2}] \le \frac{1}{\sqrt{1 - 2M^2 t}}. \tag{46} \]

In the case of Wishart matrices, for which $S_{x,i}$ has a standard normal distribution, (46) holds with equality for $M = 1$.

We first note that (46) is proven in [15, Section IV], for the case that $C_{ij} = \pm 1$ with equal probability. For any $k$ and $x$, the bound is even valid for all $-1/2 \le t \le 1/2$. We now extend the case where $C_{ij} = \pm 1$ to the case where $C_{ij}$ is symmetric around zero and satisfies $|C_{ij}| < M$ almost surely.

We write $C_{ij} = A_{ij} C^*_{ij}$, where $A_{ij} = |C_{ij}| < M$ a.s. and $C^*_{ij} = \mathrm{sign}(C_{ij})$. Moreover, $A_{ij}$ and $C^*_{ij}$ are independent, since $C_{ij}$ has a symmetric distribution around zero. Thus, we obtain that $S_{x,i} = S^*_{A_i x, i}$, where $(A_i x)_j = A_{ji} x_j$, and

\[ S^*_{x,i} = \sum_{j=1}^{k} x_j C^*_{ji}. \tag{47} \]

For $S^*_{x,i}$ we know that (46) is proven. Therefore,

\[ \mathbb{E}_k[e^{t S_{x,i}^2}] \le \mathbb{E}_k\Big[ \frac{1}{\sqrt{1 - 2t \|A_i x\|_2^2}} \Big] \tag{48} \]

for all $t$ such that $-1/2 \le t \|A_i x\|_2^2 \le 1/2$ almost surely. When $\|x\|_2^2 = 1$, we have that

\[ 0 \le \|A_i x\|_2 \le M \quad \text{a.s.} \tag{49} \]

Therefore, $\mathbb{E}_k[e^{t S_{x,i}^2}] \le \frac{1}{\sqrt{1 - 2M^2 t \|x\|_2^2}} = \frac{1}{\sqrt{1 - 2M^2 t}}$ for all $0 \le t M^2 \|x\|_2^2 \le 1/2$. Thus, we arrive at

\[ \mathbb{P}_k(\langle x, Wx \rangle \ge \alpha) \le \exp\Big( -n \sup_{0 \le t \le 1/(2M^2)} \Big( t\alpha - \log \frac{1}{\sqrt{1 - 2M^2 t}} \Big) \Big). \tag{50} \]

Note that since $\|x\|_2 = 1$, the bound is independent of $x$. Performing the maximum over $t$ on the right-hand side of (50) yields $t^* = \frac{1}{2M^2} - \frac{1}{2\alpha}$, and inserting this value $t^*$ in the right-hand side gives the result. □
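
For $\pm 1$ entries and small $k$, the moment generating function of $S_{x,1}^2$ can be computed exactly by enumeration, which gives a direct check of the Gaussian-type bound $\mathbb{E}[e^{t S_{x,1}^2}] \le 1/\sqrt{1 - 2t}$ for $0 \le t < 1/2$. The sketch below is only an illustration with hypothetical function names; it takes the constant in the bound equal to 1, the value at which the bound is quoted from [15] for the $\pm 1$ case.

```python
import itertools
import math


def mgf_square_projection(x, t):
    """Exact E[exp(t * S^2)] for S = sum_m x_m C_m with i.i.d. C_m = +1 or -1.

    All 2^k sign patterns are enumerated, so this is only practical for small k.
    """
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(x)):
        s = sum(xm * cm for xm, cm in zip(x, signs))
        total += math.exp(t * s * s)
    return total / 2 ** len(x)


def gaussian_bound(t):
    """Right-hand side 1/sqrt(1 - 2t), the exact value for a standard normal S."""
    return 1.0 / math.sqrt(1.0 - 2.0 * t)


def chernoff_rate(alpha):
    """Rate obtained by optimizing t*alpha + 0.5*log(1 - 2t) at t* = 1/2 - 1/(2*alpha)."""
    t_star = 0.5 - 0.5 / alpha
    return t_star * alpha + 0.5 * math.log(1.0 - 2.0 * t_star)


unit_vectors = [[1.0, 0.0], [0.6, 0.8], [0.5, 0.5, 0.5, 0.5]]
t_values = [0.0, 0.1, 0.25, 0.4, 0.49]
checks = [(mgf_square_projection(x, t), gaussian_bound(t))
          for x in unit_vectors for t in t_values]
```

In each case the exact value stays below the Gaussian expression, and `chernoff_rate(alpha)` reproduces $\frac{1}{2}(\alpha - 1 - \log \alpha)$, the rate that results from inserting $t^*$ when the constant is 1.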
3.2 Proof of Theorem 3.1

The proof is similar to that of Theorem 2.1. For the proofs of (39) and (41), we again use (18), but now choose an $x'$ of which only the first $k$ components are non-zero. This leads to, using that $k_n \to \infty$, so that $k_n \ge k$ for $n$ sufficiently large,

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\max} \ge \alpha) \le \sup_t \big( t\alpha - \log \mathbb{E}[e^{t S_{x',1}^2}] \big). \tag{51} \]

Minimizing over all $x'$ of which only the first $k$ components are non-zero leads to

\[ \liminf_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_{k_n}(\lambda_{\max} \ge \alpha) \le I_k(\alpha), \tag{52} \]

where this bound is valid for all $k \in \mathbb{N}$. We next claim that

\[ \lim_{k \to \infty} I_k(\alpha) = I_\infty(\alpha). \tag{53} \]

For this, we first note that the sequence $k \mapsto I_k(\alpha)$ is non-increasing and non-negative, so that it has a pointwise limit. Secondly, $I_k(\alpha) \ge I_\infty(\alpha)$ for all $k$, since the set of possible choices of $x$ in (38) is larger than the one in (5). Now it is not hard to see that $\lim_{k \to \infty} I_k(\alpha) = I_\infty(\alpha)$, by splitting into the two cases depending on whether the infimum over $x$ in (38) is attained or not. This completes the proof of (39) and (41).

For the proof of (40) and (42), we adapt the proof of (7) and (9). As in the proof of Theorem 2.1(a-b), we wish to show that the terms $\frac{1}{n} \log N_d$ and $2d\lambda_{\max}$ vanish when we take the logarithm of (23), divide by $n$ and let $n \to \infty$. However, this time we wish to let $k_n \to \infty$ as well, for $k_n$ as large as possible. We will have $k_n = o(n)$ in mind.

The overestimation (26) can be improved using an upper bound for the number $M_R = N_{1/R}$ of spheres of radius $1/R$ needed to cover the surface of a $k$-dimensional sphere of radius 1 (Rogers (1963)), when $k \to \infty$,

\[ M_R = 4k\sqrt{k} \, R^k (\log k + \log\log k + \log R)(1 + O(1/\log k)) = f(k, R) R^k. \tag{54} \]

This bound is valid for $R > \sqrt{\frac{k}{k-1}}$. Since we use small spheres this time, $d \le 1/R$.

We can also improve the upper bound for $\lambda_{\max}$. For any $\Omega_n > 1$, which we will choose appropriately later on, we split

\[ P_{\min}(\alpha) \le \mathbb{P}(\lambda_{\min} \le \alpha, \lambda_{\max} \le \Omega_n) + P_{\max}(\Omega_n), \tag{55} \]
\[ P_{\max}(\alpha) = \mathbb{P}(\alpha \le \lambda_{\max} \le \Omega_n) + P_{\max}(\Omega_n). \tag{56} \]

We first give a sketch of the proof, omitting the details. The idea is that the first term of these expressions will yield the rate function $I_\infty(\alpha)$. The term $P_{\max}(\Omega_n)$ has an exponential rate which is $O(\Omega_n) - O\big( \frac{k_n \log k_n}{n} \big)$, and, since $\frac{k_n \log k_n}{n} = o(\log n)$, can thus be made arbitrarily large by taking $\Omega_n = K \log n$ with $K > 1$ large enough. This means that we can choose $\Omega_n$ large enough to make this rate disappear according to the largest-exponent-wins principle (13). We will need different choices of $R$ for the two terms. We will now give the details of the proof.

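We first bound $P_{\max}(\Omega_n)$ of (56), using (23). In (23), we choose $\kappa = M^2$. This leads to

\[ P_{\max}(\Omega_n) \le M_R \sup_x \mathbb{P}(\langle x, Wx \rangle \ge \Omega_n - 2dM^2 k_n) + \mathbb{P}(T_W > M^2 k_n), \]

where the supremum over $x$ runs over the centers of the small balls. Inserting (54), choosing $R = k_n$ and using $d \le 1/R$, this becomes

\[ P_{\max}(\Omega_n) \le f(k_n, k_n) \, k_n^{k_n} \sup_x \mathbb{P}(\langle x, Wx \rangle \ge \Omega_n - 2M^2) + \mathbb{P}(T_W > M^2 k_n). \]

Using Proposition 3.2, we find

\[
\begin{aligned}
P_{\max}(\Omega_n) &\le f(k_n, k_n) \, k_n^{k_n} e^{-\frac{n}{2} \left( \frac{\Omega_n}{M^2} - 3 - \log\left( \frac{\Omega_n}{M^2} - 2 \right) \right)} + e^{-n k_n I_{C^2}(M^2)} \\
&= f(k_n, k_n) \, e^{k_n \log k_n - \frac{n}{2} \left( \frac{\Omega_n}{M^2} - 3 - \log\left( \frac{\Omega_n}{M^2} - 2 \right) \right)} + e^{-n k_n I_{C^2}(M^2)}.
\end{aligned}
\tag{57}
\]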
We choose $\Omega_n = K \log n$ with $K$ so large that

\[ \frac{k_n \log k_n}{n} \le \frac{1}{4} \Big( \frac{\Omega_n}{M^2} - 3 - \log\Big( \frac{\Omega_n}{M^2} - 2 \Big) \Big). \]

Therefore, also using that

\[ f(k_n, k_n) = e^{o(n \log n)}, \]

we obtain

\[ P_{\max}(\Omega_n) \le e^{-\frac{K}{4M^2} n \log n \, (1 + o(1))} + e^{-n k_n I_{C^2}(M^2)}. \tag{58} \]

Next, we investigate the first term of (55). In this term, we can use $\Omega_n$ as the upper bound for $\lambda_{\max}$. Therefore, again starting with (23), we obtain that, for any $R$,

\[ -\frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha, \lambda_{\max} \le \Omega_n) \ge -\frac{1}{n} \Big[ \log M_R + \sup_x \log \mathbb{P}(\langle x, Wx \rangle \le \alpha + 2\Omega_n / R) \Big]. \tag{59} \]

For $\lambda_{\max}$, we get a similar expression. Inserting (54), we need to choose $R$ again. This time we wish to choose $R = R_n$ to increase in such a way that $k_n \log R_n = o(n)$ and $\Omega_n = K \log n = o(R_n)$. For the latter, we need that $R_n \gg \log n$, so that we can only satisfy the first restriction when $k_n = o\big( \frac{n}{\log\log n} \big)$. Then this is sufficient to make the term $2\Omega_n / R_n$ disappear as $k_n \to \infty$, and to make $\log M_{R_n} = O(k_n \log R_n)$ be $o(n)$, so that the term $\frac{1}{n} \log M_{R_n}$ also disappears. Therefore, for any $R = R_n$ satisfying the above two restrictions,

\[ -\frac{1}{n} \log \mathbb{P}(\lambda_{\min} \le \alpha, \lambda_{\max} \le \Omega_n) \ge I_{k_n}(\alpha + 2\Omega_n / R_n) + o(1). \tag{60} \]

Similarly,

\[ -\frac{1}{n} \log \mathbb{P}(\alpha \le \lambda_{\max} \le \Omega_n) \ge I_{k_n}(\alpha - 2\Omega_n / R_n) + o(1). \tag{61} \]

Moreover, by the fact that $\Omega_n = K \log n = o(R_n)$, we have that $2\Omega_n / R_n \le \varepsilon$ for all $n$ large enough. By the monotonicity of $\alpha \mapsto I_{k_n}(\alpha)$, we then have that

\[ I_{k_n}(\alpha + 2\Omega_n / R_n) \ge I_{k_n}(\alpha + \varepsilon), \qquad I_{k_n}(\alpha - 2\Omega_n / R_n) \ge I_{k_n}(\alpha - \varepsilon). \tag{62} \]

Since $\lim_{k\to\infty} I_k(a) = I_\infty(a)$ (see (55)), putting (61), (62) and (55) together and applying the largest-exponent-wins principle (15), we see that the proof follows when

$$I_\infty(a \pm \varepsilon) < \min\Big\{\frac{K}{4M^2}\log n,\ k_n I_{C_{11}^2}(M^2)\Big\}. \tag{63}$$

Both terms are increasing in $n$, as long as $I_{C_{11}^2}(M^2) > 0$. This is true for the $C_{11}$ we consider: if $C_{11}$ is symmetric around zero such that $|C_{11}| < M < \infty$ almost surely, then $I_{C_{11}^2}(M^2) = \infty$, and if $C_{11}$ is standard normal then $I_{C_{11}^2}(M^2) > 0$.

Therefore, (63) is true when $n$ is large enough, when $I_\infty(a \pm \varepsilon) < \infty$. On the other hand, when $I_\infty(a \pm \varepsilon) = \infty$, then we obtain that the exponential rates converge to infinity, as stated in (fJD]) and (12). We conclude that, for every $\varepsilon > 0$,

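$$\liminf_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_{\min} \le a) \ge I_\infty(a + \varepsilon), \tag{64}$$

and

$$\liminf_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_{\max} \ge a) \ge I_\infty(a - \varepsilon), \tag{65}$$

and letting $\varepsilon \downarrow 0$ completes the proof for every sequence $k_n$ such that $k_n = o(n/\log\log n)$. The proof for $\lambda_{\max}$ is identical to the above proof. □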

We believe that the above argument can be extended somewhat further, by making a further split into $K' \log\log n < \lambda_{\max} \le K \log n$ and $\lambda_{\max} \le K' \log\log n$, but we refrain from writing this down.
3.3 The limiting rate for k large

In this section, we investigate what happens when we take k large. In certain cases, we can show that 
the rate function, which depends on k, converges to the rate function for Wishart matrices. This will 
be formulated in the following theorem. 

Theorem 3.3 Assume that $C_{ij}$ satisfies (QP, and, moreover, that $\phi_C(t) \le e^{t^2/2}$ for all $t$. Then, for all $a \ge 1$, and all $k \ge 2$,

$$I_k(a) \ge \frac{1}{2}(a - 1 - \log a), \tag{66}$$

and, for all $a \ge 1$,

$$\lim_{k\to\infty} I_k(a) = I_\infty(a) = \frac{1}{2}(a - 1 - \log a). \tag{67}$$

Finally, for all $k_n \to \infty$ such that $k_n = o(n/\log\log n)$ and $a \ge 1$,

$$\lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_{\max} \ge a) = \frac{1}{2}(a - 1 - \log a). \tag{68}$$

Note that, in particular, Theorem 3.3 implies that $I_\infty(a) = \frac{1}{2}(a - 1 - \log a) > 0$ for all $a > 1$.

Theorem 3.3 is a kind of universality result, and shows that, for $k$ large, the rate functions of certain sample covariance matrices converge to the rate function for Wishart matrices. An example where $\phi_C(t) \le e^{t^2/2}$ holds is when $C_{ij} = \pm 1$ with equal probability. We will call this example the Bernoulli case. A second example is the uniform random variable on $[-\sqrt{3}, \sqrt{3}]$, for which the variance also equals 1. We will prove these bounds below.

Of course, the relation that $\phi_C(t) \le e^{t^2/2}$ for random variables with mean 0 and variance 1 is equivalent to the statement that $\phi_C(t) \le e^{t^2\sigma^2/2}$ for a random variable $C$ with mean 0 and variance $\sigma^2$. Thus, we will check the condition for uniform random variables on $[-1, 1]$ and for the Bernoulli case. We will denote the moment generating functions by $\phi_U$ and $\phi_B$. We start with the second, for which we have that

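$$\phi_B(t) = \cosh(t) = \sum_{n=0}^{\infty} \frac{t^{2n}}{(2n)!} \le \sum_{n=0}^{\infty} \frac{t^{2n}}{2^n n!} = e^{t^2/2}, \tag{69}$$

since $(2n)! \ge 2^n n!$ for all $n \ge 0$. The proof for $\phi_U$ is similar. Indeed,

$$\phi_U(t) = \frac{\sinh(t)}{t} = \sum_{n=0}^{\infty} \frac{t^{2n}}{(2n+1)!} \le \sum_{n=0}^{\infty} \frac{t^{2n}}{6^n n!} = e^{t^2/6}, \tag{70}$$

since now $(2n+1)! \ge 6^n n!$ for all $n \ge 0$.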
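The two bounds above can be checked numerically. The following is a minimal sketch in Python (the function names and the grid of $t$ values are our own illustrative choices, not part of the argument): it verifies the two factorial inequalities for small $n$ and the resulting bounds $\cosh(t) \le e^{t^2/2}$ and $\sinh(t)/t \le e^{t^2/6}$ on a grid.

```python
import math

def mgf_bernoulli(t):
    """MGF of a +/-1 variable with equal probabilities: cosh(t)."""
    return math.cosh(t)

def mgf_uniform(t):
    """MGF of a uniform variable on [-1, 1]: sinh(t)/t, with value 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sinh(t) / t

# Factorial inequalities used term by term in the two power-series comparisons.
for n in range(0, 20):
    assert math.factorial(2 * n) >= 2**n * math.factorial(n)
    assert math.factorial(2 * n + 1) >= 6**n * math.factorial(n)

# Sub-Gaussian bounds on a grid of t values in [-5, 5] (the grid is an arbitrary choice).
grid = [i / 10.0 for i in range(-50, 51)]
bernoulli_ok = all(mgf_bernoulli(t) <= math.exp(t * t / 2.0) for t in grid)
uniform_ok = all(mgf_uniform(t) <= math.exp(t * t / 6.0) for t in grid)
print(bernoulli_ok, uniform_ok)
```

A grid check of this kind is only a sanity test; the series comparison in the text is what proves the bounds for all $t$.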

Proof of Theorem 3.3. Using Theorem 2.1 and 3.1, we claim that it suffices to prove that, uniformly for $x$ with $\|x\|_2 = 1$ and $t < 1/2$,

$$\mathbb{E}\big[e^{t S_{x,1}^2}\big] \le \frac{1}{\sqrt{1 - 2t}}. \tag{71}$$

We will prove (71) below, and first prove Theorem 3.3 assuming that (71) holds.

When (71) holds, then, using Theorem 3.1 and (|5j), it immediately follows that (66) holds. Here we also use that $a \mapsto \frac{1}{2}(a - 1 - \log a)$ is continuous, so that the limit over $\varepsilon \downarrow 0$ can be computed.
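As a worked step, the following computation sketches how a bound of the form $\mathbb{E}[e^{tS_{x,1}^2}] \le (1-2t)^{-1/2}$ leads to the lower bound $\frac{1}{2}(a - 1 - \log a)$. It assumes (this is our reading, since Theorem 3.1 is not restated here) that Theorem 3.1 supplies a variational lower bound of the form $I_k(a) \ge \inf_{\|x\|_2 = 1} \sup_t \big(ta - \log \mathbb{E}[e^{tS_{x,1}^2}]\big)$ for $a \ge 1$.

```latex
\begin{align*}
ta - \log \mathbb{E}\big[e^{t S_{x,1}^2}\big]
  &\ge ta + \tfrac{1}{2}\log(1 - 2t), \qquad 0 \le t < \tfrac{1}{2}, \\
\frac{d}{dt}\Big(ta + \tfrac{1}{2}\log(1 - 2t)\Big) = a - \frac{1}{1 - 2t} = 0
  &\quad\Longleftrightarrow\quad t^* = \tfrac{1}{2}\Big(1 - \frac{1}{a}\Big), \\
t^* a + \tfrac{1}{2}\log(1 - 2t^*)
  &= \tfrac{1}{2}(a - 1) + \tfrac{1}{2}\log\frac{1}{a}
   = \tfrac{1}{2}(a - 1 - \log a).
\end{align*}
```

For $a \ge 1$ the maximizer satisfies $0 \le t^* < 1/2$, so it lies in the range where the moment generating function bound is used.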

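To prove (67), we take $x = \frac{1}{\sqrt{k}}(1, \ldots, 1)$, to obtain that, with $S_{x,1} = \frac{1}{\sqrt{k}}\sum_{i=1}^{k} C_{i1}$,

$$I_k(a) \le \sup_t \big(ta - \log \mathbb{E}[e^{t S_{x,1}^2}]\big). \tag{72}$$

We claim that, when $k \to \infty$, for all $0 < t < 1/2$,

$$\mathbb{E}[e^{t S_{x,1}^2}] \to \mathbb{E}[e^{t Z^2}] = \frac{1}{\sqrt{1 - 2t}}, \tag{73}$$

where $Z$ is a standard normal random variable. This implies the matching upper bound for $I_k(a)$ as $k \to \infty$, and thus (67). Equation (68) follows in a similar way, also using that $a \mapsto \frac{1}{2}(a - 1 - \log a)$ is continuous.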
We complete the proof by showing that (71) and (73) hold. We start with (71). We rewrite, for $t \ge 0$, and writing $Z$ for a standard normal random variable,

$$\mathbb{E}[e^{t S_{x,1}^2}] = \mathbb{E}[e^{\sqrt{2t}\, Z S_{x,1}}] = \mathbb{E}\Big[\prod_{j=1}^{k} \phi_C(\sqrt{2t}\, Z x_j)\Big]. \tag{74}$$

We now use that $\phi_C(t) \le e^{t^2/2}$ to arrive at

$$\mathbb{E}[e^{t S_{x,1}^2}] \le \mathbb{E}\Big[\prod_{j=1}^{k} e^{t Z^2 x_j^2}\Big] = \mathbb{E}[e^{t Z^2}] = \frac{1}{\sqrt{1 - 2t}}. \tag{75}$$

This completes the proof of (71). We proceed with (73). We use

$$\mathbb{E}[e^{t S_{x,1}^2}] = \mathbb{E}\Big[\prod_{j=1}^{k} \phi_C\big(\sqrt{2t/k}\, Z\big)\Big]. \tag{76}$$

We will use dominated convergence. By the assumption, we have that $\phi_C(\sqrt{2t/k}\, z) \le e^{t z^2/k}$, so that $\prod_{j=1}^{k} \phi_C(\sqrt{2t/k}\, z) \le e^{t z^2}$, which has a finite expectation when $t < 1/2$. Moreover, $\prod_{j=1}^{k} \phi_C(\sqrt{2t/k}\, z)$ converges to $e^{t z^2}$ pointwise in $z$. Therefore, dominated convergence proves the claim in (73), and completes the proof. □
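For the Bernoulli case, both the uniform bound and the convergence can be illustrated by exact enumeration over all sign patterns. The sketch below is illustrative only: the function names, the value $t = 0.25$ and the unnormalised vector `raw` are our own hypothetical choices. It computes $\mathbb{E}[e^{tS^2}]$ for $S = \sum_i x_i C_i$ and compares it with $1/\sqrt{1-2t}$.

```python
import itertools
import math

def mgf_s_squared(x, t):
    """E[exp(t * S^2)] for S = sum_i x_i * C_i with i.i.d. +/-1 signs C_i (exact enumeration)."""
    k = len(x)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=k):
        s = sum(xi * ci for xi, ci in zip(x, signs))
        total += math.exp(t * s * s)
    return total / 2**k

def gaussian_bound(t):
    """E[exp(t * Z^2)] = 1 / sqrt(1 - 2t) for a standard normal Z and t < 1/2."""
    return 1.0 / math.sqrt(1.0 - 2.0 * t)

t = 0.25
# Uniform direction x = (1, ..., 1)/sqrt(k): the value moves towards the Gaussian one as k grows.
uniform_values = [mgf_s_squared([1.0 / math.sqrt(k)] * k, t) for k in (2, 4, 8, 12)]

# An arbitrary (hypothetical) unit vector, to illustrate that the bound is uniform in x.
raw = [3.0, 1.0, 2.0, 0.5]
norm = math.sqrt(sum(v * v for v in raw))
skewed_value = mgf_s_squared([v / norm for v in raw], t)

print(uniform_values, skewed_value, gaussian_bound(t))
```

The enumeration costs $2^k$ terms, so it is only meant for small $k$; it does not replace the dominated convergence argument.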

4 The smallest eigenvalue for $C_{ij} = \pm 1$

Unfortunately, we are not able to prove a result similar to Theorem 3.3 for the smallest eigenvalue. In fact, as we will comment on in more detail in Section 4.2 below, we expect the result to be false for the smallest eigenvalue, in particular when $a$ is small. There is one example where we can prove a partial convergence result, and that is when $C_{ij} = \pm 1$ with equal probability. Indeed, in this case it is shown in [HI Section IV] that (71) holds for all $t \ge -1/2$. This leads to the following result, which also implies that $I_k(a) > 0$ for $a \ne 1$ in the case where $\mathrm{Var}(C_{11}^2) = 0$ (recall also Theorem 2.1):



Theorem 4.1 Assume that $C_{ij} = \pm 1$ with equal probability. Then, for all $a \ge 1/2$, and all $k \ge 2$,

$$I_k(a) \ge \frac{1}{2}(a - 1 - \log a), \tag{77}$$

and, for all $a \ge 1/2$,

$$I_\infty(a) = \frac{1}{2}(a - 1 - \log a). \tag{78}$$

Finally, for all $0 < a \le 1/2$,

$$I_k(a) \ge \frac{1}{2}(-a + \log 2). \tag{79}$$



Proof. The proof of (77)-(78) is identical to the proof of Theorem 3.3, now using that (71) holds for all $t \ge -1/2$. Equation (79) follows since $I_k(a) \ge \inf_{x : \|x\|_2 = 1} \big(-\frac{a}{2} - \log \mathbb{E}[e^{-\frac{1}{2} S_{x,1}^2}]\big)$ and the bound on the moment generating function for $t = -1/2$. □

4.1 Rate for the probability of one or more zero eigenvalues for $C_{ij} = \pm 1$

In the above computations, we obtain no control over the probability of a large deviation of the smallest eigenvalue $\lambda_{\min}$. In this and the next section, we investigate this problem in the case where $C_{ij} = \pm 1$.
Proposition 4.2 Suppose that $C_{ij} = \pm 1$ with equal probability. Then, for all $0 < l \le k - 1$, and any $k_n = O(n^b)$ for some $b$,

$$\lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_1 = \cdots = \lambda_l = 0) = l \log 2, \tag{80}$$

where the $\lambda_i$ denote the eigenvalues of $W$ arranged in increasing order.

Proof. The upper bound in (80) is simple, since, to have $l$ eigenvalues equal to zero, we can take the first $l + 1$ rows of $C$ to be equal. Since, for any vector $w$,

$$\langle w, Ww\rangle = \frac{1}{n}\langle w, CC^T w\rangle = \frac{1}{n}\|C^T w\|_2^2,$$

we obtain that $w$ is an eigenvector with eigenvalue zero precisely when $\|C^T w\|_2 = 0$. When the first $l + 1$ rows of $C$ are equal, then there are $l$ linearly independent vectors for which $\|C^T w\|_2 = 0$, so that the multiplicity of the eigenvalue zero is at least $l$. Moreover, the probability that the first $l + 1$ rows of $C$ are equal is equal to $2^{-nl}$.
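For the smallest non-trivial case, $k = 2$ and $l = 1$, the probability of a zero eigenvalue can be computed exactly by brute force. The sketch below is an illustration under our own naming, restricted to small $n$ because it enumerates all $2^{2n}$ code pairs. It uses that $\det(CC^T) = n^2 - \langle r_1, r_2\rangle^2$ for the two rows $r_1, r_2$ of $C$.

```python
import itertools
import math
from fractions import Fraction

def prob_zero_eigenvalue_two_users(n):
    """Exact P(lambda_1 = 0) for k = 2 and code length n, by enumerating all +/-1 codes."""
    codes = list(itertools.product((-1, 1), repeat=n))
    singular = 0
    for r1 in codes:
        for r2 in codes:
            dot = sum(a * b for a, b in zip(r1, r2))
            # det(C C^T) = n^2 - <r1, r2>^2 vanishes iff W = C C^T / n has a zero eigenvalue.
            if n * n - dot * dot == 0:
                singular += 1
    return Fraction(singular, len(codes) ** 2)

for n in (2, 4, 6, 8):
    p = prob_zero_eigenvalue_two_users(n)
    # The last column is the empirical rate -(1/n) log P, to be compared with log 2.
    print(n, p, -math.log(float(p)) / n)
```

In this case the probability is $2 \cdot 2^{-n}$ (the second row equals plus or minus the first), so the empirical rate $\frac{n-1}{n}\log 2$ approaches $\log 2$, in line with (80) for $l = 1$.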

We prove the lower bound in (80) by induction by showing that

$$\lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_1 = \cdots = \lambda_l = 0) \ge l \log 2. \tag{81}$$

When $l = 0$, then the claim is trivial. It suffices to advance the induction hypothesis.

Suppose that there are $l$ linearly independent eigenvectors with eigenvalue zero. Since the eigenvectors can be chosen to be orthogonal, it is possible to make linear combinations, such that the first $l - 1$ all have one zero coordinate $j$, whereas the $l$th has all coordinates zero except coordinate $j$. This means that the first $l - 1$ eigenvectors fix some part of $C^T$, but not the $j$th column. The $l$th eigenvector however fixes precisely this column. Fixing one column of $C^T$ has probability $2^{-n}$. The number of possible rows $j$ is bounded by $k$, which in turn is bounded by $n^b = e^{o(n)}$. Therefore, we have

$$\lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_1 = \cdots = \lambda_l = 0) \ge \log 2 + \lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_{k_n}(\lambda_1 = \cdots = \lambda_{l-1} = 0). \tag{82}$$

The claim follows from the induction hypothesis. □

Note that Proposition 4.2 shows that (78) cannot be extended to $a = 0$. Therefore, a changeover takes place between $a = 0$ and $a \ge \frac{1}{2}$, where for $a \ge \frac{1}{2}$, the rate function equals the one for Wishart matrices, while for $a = 0$, this is not the case. We will comment more on this in Conjecture 4.3 below.



4.2 A conjecture about the smallest eigenvalue for $C_{ij} = \pm 1$

We have already shown that (77) is sharp. By Proposition 4.2, (79) is not sharp, since $I_{\min}(0) = \log 2$, whereas (79) only yields $\lim_{a \downarrow 0} I_{\min}(a) \ge \frac{1}{2}\log 2$.

We can use (CLI]) again with $x' = \frac{1}{\sqrt{2}}(1, 1, 0, \ldots)$. For this vector, $\mathbb{E}[e^{t S_{x',1}^2}] = \frac{1}{2}(e^{2t} + 1)$, and calculating the corresponding rate function gives $I_k(a) \le I^{(2)}(a) = \frac{a}{2}\log a + \frac{2-a}{2}\log(2 - a)$, which implies that $\lim_{a \downarrow 0} I_k(a) \le \log 2$.

It appears that below a certain $a = a^*_k$, the optimal strategy changes from $x^{(k)} = \frac{1}{\sqrt{k}}(1, 1, \ldots)$ to $x^{(2)} = \frac{1}{\sqrt{2}}(1, 1, 0, \ldots)$. In words, that means that for not too small eigenvalues of $W$, all entries of $C$ contribute equally to create a small eigenvalue. However, smaller values of the smallest eigenvalue of $W$ are created by only two rows of $C$, whereas the others are close to orthogonal. Thus, a change in strategy occurs, which gives rise to a phase transition in the asymptotic exponential rate for the smallest eigenvalue.

We have the following conjecture: 

Conjecture 4.3 For each $k$ and all $a \ge 0$, there exists an $a^* = a^*_k > 0$ so that

$$I_k(a) = I^{(2)}(a) = \frac{a}{2}\log a + \frac{2-a}{2}\log(2 - a),$$

for $a \le a^*_k$. For $a > a^*_k$,

$$I^{(k)}(a) \ge I_k(a) \ge I_\infty(a) = \frac{1}{2}(a - 1 - \log a).$$

For $k \to \infty$, the last inequality will become an equality. Consequently, $\lim_{k\to\infty} a^*_k = a^*$, which is the positive solution of $I_\infty(a) = I^{(2)}(a)$.

For $k = 2$, the conjecture is trivially true, since the two optimal strategies are the same, and the only ones possible. Note that in the proof of Proposition 4.2 we used that to have a zero eigenvalue, we need two rows of $C$ to be equal. Thus, the conjecture is also proven for $a = 0$. Furthermore, with Theorem 3.2 and (46), the conjecture follows for all $k$ for $a \ge \frac{1}{2}$. We lack a proof for $0 < a < \frac{1}{2}$. Numerical evaluation gives that $a^*_3 \approx 0.425$, and for $k \to \infty$, $a^*_k \approx 0.253$. We have some evidence that suggests that $a^*_k$ decreases with $k$.
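The numerical values quoted above can be reproduced with a short computation. The sketch below rests on assumptions of ours: it takes the limiting threshold to be the crossing point of $I_\infty$ and $I^{(2)}$, and, for finite $k$, the crossing point of $I^{(2)}$ with the rate function obtained from the uniform direction $x^{(k)}$ (computed as a numerical Legendre transform). Whether this is exactly how the quoted values were obtained is not stated in the text; all function names are hypothetical.

```python
import math

def rate_inf(a):
    """Wishart rate function I_inf(a) = (a - 1 - log a) / 2."""
    return 0.5 * (a - 1.0 - math.log(a))

def rate_two(a):
    """I^(2)(a) = (a/2) log a + ((2 - a)/2) log(2 - a), from x = (1, 1, 0, ...)/sqrt(2)."""
    return 0.5 * a * math.log(a) + 0.5 * (2.0 - a) * math.log(2.0 - a)

def rate_uniform(a, k, t_lo=-30.0, t_hi=0.0, iters=200):
    """sup over t in [t_lo, t_hi] of t*a - log E[exp(t S^2)], S = (C_1 + ... + C_k)/sqrt(k)."""
    # S^2 takes the value (k - 2j)^2 / k with probability binom(k, j) / 2^k.
    atoms = [((k - 2 * j) ** 2 / k, math.comb(k, j) / 2**k) for j in range(k + 1)]

    def objective(t):
        return t * a - math.log(sum(p * math.exp(t * s2) for s2, p in atoms))

    lo, hi = t_lo, t_hi
    for _ in range(iters):  # ternary search; the objective is concave in t
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def bisect(g, lo, hi, iters=60):
    """Root of g on [lo, hi], assuming g changes sign exactly once on the interval."""
    sign_lo = g(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_star_inf = bisect(lambda a: rate_inf(a) - rate_two(a), 0.1, 0.5)
a_star_3 = bisect(lambda a: rate_uniform(a, 3) - rate_two(a), 0.4, 0.5)
print(a_star_inf, a_star_3)
```

The brackets passed to `bisect` were chosen by hand so that each contains a single sign change; they would need adjusting for other values of $k$.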

5 An application: Mobile Communication Systems 

Our results on the eigenvalues of sample covariance matrices were triggered by a problem in mobile
communication systems. In this case, we take the C matrix as a coding sequence, for which we can 
assume that the elements are ±1. Thus, all our results apply to this case. In this section, we will 
describe the consequences of our results on this problem. 

5.1 Soft-Decision Parallel Interference Cancellation 

In Code Division Multiple Access (CDMA) communication systems, each of k users multiplies his 
data signal by an individual coding sequence. The base station can distinguish the different messages 
by taking the inner product of the total signal with each coding sequence. This is called Matched 
Filter (MF) decoding. An important application is mobile telephony. Since, due to synchronisation 
problems, it is unfeasible to implement completely orthogonal codes for mobile users, the decoded 
messages will suffer from Multiple Access Interference (MAI). In practice, pseudo-random codes are 
used. Designers of decoding schemes are interested in the probability that a decoding error is made. 

In the following, we explain a specific method to iteratively estimate and subtract the MAI, 
namely, Soft Decision Parallel Interference Cancellation (SD-PIC). For more background on SD-PIC, 
see [5], [8], [TU] and [TB], as well as the references therein. Because this procedure is linear, it can be 
expressed in matrix notation. We will show that the possibility of a decoding error is related to a 
large deviation of the maximum or minimum eigenvalue of the code correlation matrix. 

To highlight the aspects that are relevant in this article, we suppose that each user sends only one data bit $b_m \in \{+1, -1\}$, and we omit noise from additional sources. We can denote all sent data multiplied by their amplitude in a column vector $Z$, i.e., $Z_m = \sqrt{P_m}\, b_m$, where $P_m$ is the power of the $m$th user. The $k$ codes are modeled as the different rows of length $n$ of the code matrix $C$, consisting of i.i.d. random bits with distribution

$$\mathbb{P}(C_{mi} = +1) = \mathbb{P}(C_{mi} = -1) = \frac{1}{2}.$$

Thus, k plays the role of the number of users, while n is the length of the different codes. 

The base station then receives a total signal $s = C^T Z$. Decoding for user $m$ is done by taking the inner product with the code of the $m$th user $(C_{m1}, \ldots, C_{mn})$, and dividing by $n$. This yields an estimate $\hat{Z}^{(1)}_m$ for the sent signal $Z_m$. In matrix notation, the vector $Z$ is estimated by

$$\hat{Z}^{(1)} = \frac{1}{n} C s = WZ.$$

Thus, we see that multiplying with the matrix $W$ is equivalent to the MF decoding scheme. In order to estimate the signal, we must find the inverse matrix $W^{-1}$. From $\hat{Z}^{(1)}$, we estimate the sent bit $b_m$ by

$$\hat{b}^{(1)}_m = \mathrm{sign}(\hat{Z}^{(1)}_m) \tag{83}$$

(where, when $\hat{Z}^{(1)}_m = 0$, we toss an independent fair coin to decide what the value of $\mathrm{sign}(\hat{Z}^{(1)}_m)$ is). Below, we explain the role of the eigenvalues of $W$ in the more advanced SD-PIC decoding scheme.
The MF estimate contains MAI. When we write $\hat{Z}^{(1)} = Z + (W - I)Z$, it is clear that the estimated bit vector is a sum of the correct bit vector and MAI. In SD-PIC, the second term is subtracted, with $Z$ replaced by $\hat{Z}^{(1)}$. In the case of multistage PIC, each new estimate is used in the next PIC iteration. We will now write the multistage SD-PIC procedure in matrix notation. We number the successive SD estimates for $Z$ with an index $s$, where $s = 1$ corresponds to the MF decoding. In each new iteration, the latest guess for the MAI is subtracted. The iteration in a recursive form is therefore:

$$\hat{Z}^{(s)} = \hat{Z}^{(1)} - (W - I)\hat{Z}^{(s-1)}. \tag{84}$$

This can be worked out to

$$\hat{Z}^{(s)} = \sum_{\varsigma=0}^{s-1} (I - W)^{\varsigma} W Z. \tag{85}$$

We then estimate $b_m$ by

$$\hat{b}^{(s)}_m = \mathrm{sign}(\hat{Z}^{(s)}_m). \tag{86}$$

When $s \to \infty$, the series $\sum_{\varsigma=0}^{\infty} (I - W)^{\varsigma}$ converges to $W^{-1}$, as long as the eigenvalues of $W$ are between 0 and 2. Otherwise, a decoding error is made. This is the crux of the method, see also [21] for the above matrix computations. When $k = o(n/\log\log n)$, the values $\lambda_{\min} = 0$ and $\lambda_{\max} \ge 2$ are large deviations, and therefore our derived rate functions provide information on the error probability. In the next section, we will describe these results, and we will also obtain bounds on the exponential rate of a bit error in the case that $s$ is fixed and $k$ is large. For an extensive introduction to CDMA and PIC procedures, we refer to [18].
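The recursion (84) is easy to simulate. The following sketch uses a hypothetical toy system of our own choosing ($k = 3$ users, code length $n = 4$, equal powers, no noise), for which the eigenvalues of $W$ are $1$ and $1 \pm \frac{1}{\sqrt{2}}$, all between 0 and 2. It is meant only to illustrate the matrix recursion, not a realistic receiver.

```python
def mat_vec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def code_correlation(c):
    """W = C C^T / n for a k x n code matrix C given as a list of rows."""
    n = len(c[0])
    return [[sum(x * y for x, y in zip(r1, r2)) / n for r2 in c] for r1 in c]

def sd_pic(w, z, stages):
    """Return the successive estimates Z^(1), ..., Z^(stages) of the SD-PIC recursion."""
    k = len(z)
    w_minus_i = [[w[i][j] - (1.0 if i == j else 0.0) for j in range(k)] for i in range(k)]
    z_mf = mat_vec(w, z)  # matched-filter estimate Z^(1) = W Z
    estimates = [z_mf]
    for _ in range(stages - 1):
        mai = mat_vec(w_minus_i, estimates[-1])  # current guess of the interference
        estimates.append([a - b for a, b in zip(z_mf, mai)])
    return estimates

# Hypothetical toy example: three users with correlated codes of length four.
codes = [[1, 1, 1, 1],
         [1, 1, 1, -1],
         [1, 1, -1, 1]]
bits = [1.0, -1.0, -1.0]
w = code_correlation(codes)
estimates = sd_pic(w, bits, stages=40)
print(estimates[0], estimates[-1])
```

In this example the matched-filter estimate for the first user is exactly 0, so the MF decision is a coin toss, while the later SD-PIC stages approach the sent vector at the geometric rate given by the spectral radius of $I - W$.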



5.2 Results for Soft-Decision Parallel Interference Cancellation

There are two cases that need to be distinguished, namely, the case where $s \to \infty$, and the case where $s$ is fixed. We start with the former, which is simplest. As explained in the previous section, due to the absence of noise, there can only be bit-errors when $\lambda_{\min} = 0$ or when $\lambda_{\max} \ge 2$. By (79), the rate of $\lambda_{\min} = 0$ is at least $\frac{1}{2}\log 2 \approx 0.35\ldots$, whereas the rate of $\lambda_{\max} \ge 2$ is bounded below by $\frac{1}{2} - \frac{1}{2}\log 2 \approx 0.15\ldots$ The latter bound is weaker, and thus, by the largest-exponent-wins principle, we obtain the following result:

Theorem 5.1 (Bit-error rate for optimal SD-PIC) For all $k$ fixed, or for $k = k_n \to \infty$ such that $k_n = o(n/\log\log n)$,

$$-\frac{1}{n}\log \mathbb{P}_k\big(\exists m = 1, \ldots, k \text{ for which } \lim_{s\to\infty} \hat{b}^{(s)}_m \ne b_m\big) \ge \frac{1}{2} - \frac{1}{2}\log 2. \tag{87}$$

We emphasize that in the statement of the result, we write, for all $m = 1, \ldots, k$, that $\lim_{s\to\infty} \hat{b}^{(s)}_m \ne b_m$ for the statement that either $\lim_{s\to\infty} \hat{b}^{(s)}_m$ does not exist, or that $\lim_{s\to\infty} \hat{b}^{(s)}_m$ exists, but is unequal to $b_m$. We observe that when $\lambda_{\max} > 2$, then

$$\hat{Z}^{(s)} = \sum_{\varsigma=0}^{s-1} (I - W)^{\varsigma} W Z \tag{88}$$

oscillates, so that we can expect there to be errors in every stage. This is sometimes called the ping-pong effect (see [3T]). Thus, one would expect that

$$-\frac{1}{n}\log \mathbb{P}_{k_n}\big(\exists m = 1, \ldots, k \text{ for which } \lim_{s\to\infty} \hat{b}^{(s)}_m \ne b_m\big) = \frac{1}{2} - \frac{1}{2}\log 2.$$

However, this depends also on the relation between Z and the eigenvector corresponding to A max . 
Indeed, when Z is orthogonal to the eigenvector corresponding to A max , then the equality does not 
follow. To avoid such problems, we stick to lower bounds on the rates in this section, rather than 
asymptotics. 

We next go to the case where $s$ is fixed. We again consider the case where $k$ is large and fixed, or that $k = k_n \to \infty$. In this case, it can be expected that the rate converges to 0 as $k \to \infty$. We already know that the probability that $\lambda_{\max} \ge 2$ or $\lambda_{\min} = 0$ is exponentially small with a fixed strictly positive lower bound on the exponential rate. Thus, we shall assume that $0 < \lambda_{\min} \le \lambda_{\max} < 2$. We can then rewrite

$$\hat{Z}^{(s)} = \sum_{\varsigma=0}^{s-1} (I - W)^{\varsigma} W Z = \big[I - (I - W)^s\big] Z. \tag{89}$$



For simplicity, we will first assume that $Z_i = \pm 1$ for all $i = 1, \ldots, k$, which is equivalent to assuming that all powers are equal. When $s$ is fixed, we cannot have any bit-errors when

$$\big|\big((I - W)^s Z\big)_i\big| < 1. \tag{90}$$

We can bound

$$\big|\big((I - W)^s Z\big)_i\big| \le \varepsilon_k^s \|b\|_2, \tag{91}$$

where $\varepsilon_k = \max\{1 - \lambda_{\min}, \lambda_{\max} - 1\}$. Since $\|b\|_2 = \sqrt{k}$, we obtain that there cannot be any bit-errors when $\varepsilon_k^s \sqrt{k} < 1$. This gives an explicit relation between the bit-errors and the eigenvalues of a random sample covariance matrix. By applying the results from the previous two sections, we obtain the following theorem:

Theorem 5.2 (Bit-error rate for finite-stage SD-PIC and k fixed) For all $k$ such that $k > 2^{2s}$,

$$\liminf_{n\to\infty} -\frac{1}{n}\log \mathbb{P}_k\big(\exists m = 1, \ldots, k \text{ for which } \hat{b}^{(s)}_m \ne b_m\big) \ge \frac{1}{4\sqrt[s]{k}}\Big[1 + O\Big(\frac{1}{\sqrt[2s]{k}}\Big)\Big]. \tag{92}$$

When the signals are different, related results can be obtained in terms of the minimal and maximal element of $Z$. We will not write this case out.



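Proof. By the computation in (91), there can be no bit-errors when $1 - \lambda_{\min}$ and $\lambda_{\max} - 1$ are both smaller than $1/\sqrt[2s]{k}$. Thus,

$$\mathbb{P}_k\big(\exists m = 1, \ldots, k \text{ for which } \hat{b}^{(s)}_m \ne b_m\big) \le \mathbb{P}_k\Big(\lambda_{\min} \le 1 - \frac{1}{\sqrt[2s]{k}}\Big) + \mathbb{P}_k\Big(\lambda_{\max} \ge 1 + \frac{1}{\sqrt[2s]{k}}\Big). \tag{93}$$

Each of these terms is bounded by, using Theorem 2.1,

$$e^{-n \min\big\{\lim_{\varepsilon \downarrow 0} I_k\big(1 - \frac{1}{\sqrt[2s]{k}} - \varepsilon\big),\ \lim_{\varepsilon \downarrow 0} I_k\big(1 + \frac{1}{\sqrt[2s]{k}} + \varepsilon\big)\big\}(1 + o(1))}. \tag{94}$$

Since, by Theorem 4.1 and $a \ge \frac{1}{2}$, we have that $I_k(a) \ge I_\infty(a) = \frac{1}{2}(a - 1 - \log a)$, and

$$I_\infty(a) = \frac{1}{4}(a - 1)^2 + O(|a - 1|^3), \tag{95}$$

the result follows when $k$ is so large that $1 - \frac{1}{\sqrt[2s]{k}} > \frac{1}{2}$. The latter is equivalent to $k > 2^{2s}$. □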
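The size of the correction term in this bound can be illustrated numerically. The sketch below (function names and the choice $s = 2$ are ours) evaluates $\min\{I_\infty(1 - \delta), I_\infty(1 + \delta)\}$ with $\delta = k^{-1/(2s)}$ and compares it with the leading term $\frac{1}{4}k^{-1/s}$.

```python
import math

def rate_inf(a):
    """Wishart rate function I_inf(a) = (a - 1 - log a) / 2."""
    return 0.5 * (a - 1.0 - math.log(a))

def sd_pic_rate_bound(k, s):
    """min{I_inf(1 - d), I_inf(1 + d)} with d = k**(-1/(2s)); intended for k > 2**(2s)."""
    d = k ** (-1.0 / (2.0 * s))
    return min(rate_inf(1.0 - d), rate_inf(1.0 + d))

s = 2
ratios = []
for k in (32, 256, 4096, 65536):
    leading = 1.0 / (4.0 * k ** (1.0 / s))
    bound = sd_pic_rate_bound(k, s)
    ratios.append(bound / leading)
    print(k, bound, leading, bound / leading)
```

The ratio of the bound to the leading term approaches 1 slowly, at a speed of order $k^{-1/(2s)}$, which is what the $O(\cdot)$ term in the theorem expresses.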
We finally state a result that applies to $k = k_n$:

Theorem 5.3 (Bit-error rate for finite-stage SD-PIC and k = k n ) For k n = o( "^^ ), 

$$-\frac{\sqrt[s]{k_n}}{n}\log \mathbb{P}_{k_n}\big(\exists m = 1, \ldots, k \text{ for which } \hat{b}^{(s)}_m \ne b_m\big) \ge \frac{1}{4} + O\Big(\frac{1}{\sqrt[2s]{k_n}}\Big). \tag{96}$$

Proof. We use (93) to conclude that we need to derive bounds for $\mathbb{P}_{k_n}\big(\lambda_{\min} \le 1 - 1/\sqrt[2s]{k_n}\big)$ and $\mathbb{P}_{k_n}\big(\lambda_{\max} \ge 1 + 1/\sqrt[2s]{k_n}\big)$. Unfortunately, the bounds $1 - 1/\sqrt[2s]{k_n}$ and $1 + 1/\sqrt[2s]{k_n}$ on the smallest and largest eigenvalues depend on $n$, rather than being fixed. Therefore, we need to adapt the proof of Theorem 3.1.

We note that, by Theorem 3.1,

$$\mathbb{P}_{k_n}(\lambda_{\max} \ge 2) \le e^{-n\left(\frac{1}{2} - \frac{1}{2}\log 2\right)(1 + o(1))}.$$
Then, we use (60) with $\Omega_n = 2$, and choose $R_n$ such that

$$\frac{2}{R_n} = o\Big(\frac{1}{\sqrt[2s]{k_n}}\Big), \tag{97}$$



so that R n 3> V^n- Applying ([59)) and (I54]) , we see that we need that = e ^ so that 

fc„ = o( " o ^ ) is sufficient. Finally, by Theorem 14. II and ([55]) . we have that 

$$I_{k_n}\Big(1 \pm \frac{1}{\sqrt[2s]{k_n}}\Big) \ge \frac{1}{4\sqrt[s]{k_n}}\Big(1 + O\Big(\frac{1}{\sqrt[2s]{k_n}}\Big)\Big). \tag{98}$$

This completes the proof. □ 

We now discuss the above results. In [13], it was conjectured that when $s = 2$, the rate of a single bit error for a fixed user is asymptotic to $\frac{1}{2\sqrt{k}}$ when $k \to \infty$. See also [13]. We see that we obtain a similar result, but our constant is 1/4 rather than the expected 1/2. On the other hand, our result is valid for all $s \ge 2$.

Related results were obtained for a related model, Hard-Decision Parallel Interference Cancellation (HD-PIC), where bits are iteratively estimated by bits, i.e., the estimates are rounded to ±1. Thus, this scheme is not linear, as SD-PIC is. In [HI [16], similar results as the above are obtained,

and it is shown that the rate for a bit-error for a given user is asymptotic to | y| when s is fixed 
and $k \to \infty$. This result is similar in spirit to the one in Theorem 5.2 above. The explanation of why the rate tends to zero as $1/\sqrt[s]{k}$ is much simpler for the case of SD-PIC, where the relation to eigenvalues is rather direct, compared to the explanation for HD-PIC, which is much more elaborate. It is interesting to see that both when $s = \infty$ and when $s$ is finite and $k \to \infty$, the rates in the two systems are of the same order.

Interestingly, in [17j . it was shown that for s — 1 and k n = - 1 " - , with high probability, all bits 
are estimated correctly when 7 < 2, while, with high probability, there is at least one bit-error when 
7 > 2. Thus, k n = O(i^) is critical for the MF system, where we do not apply SD-PIC. For SD-PIC 
with an arbitrary number of stages of SD-PIC, we have no bit-errors with large probability for all
k n = 7l " grt for all 7 > 0, and we can even pick larger values of k n such that k n = ( \ o ,\ ogn )- Thus, 
SD-PIC is more efficient than MF, in the sense that it allows more users to transmit without creating 
bit-errors. Furthermore, in [17] . the results proved in this paper are used for a further comparison 
between SD-PIC, HD-PIC and MF. Unfortunately, when we only apply a finite number of stages of 

SD-PIC, we can only allow for k n = o( " " g + ^ ) users. Similar results were obtained for HD-PIC when 

fc « = (i5irr)- 

We close this discussion on SD-PIC and HD-PIC by noting that for $k = \beta n$, $\lambda_{\min}$ converges to $(1 - \sqrt{\beta})^2$, while the largest eigenvalue $\lambda_{\max}$ converges to $(1 + \sqrt{\beta})^2$ (see [21 [4] [23]). This is explained in more detail in [5], and illustrates that SD-PIC has no bit-errors with probability converging to 1 whenever $\beta < (\sqrt{2} - 1)^2 \approx 0.17\ldots$ However, unlike the case where $k_n = o(n/\log\log n)$, we do not obtain bounds on how the probability of a bit-error tends to zero.

A further CDMA system is the decorrelator, which explicitly inverts the matrix $W$ (without approximating it by the partial sum $\sum_{\varsigma=0}^{s-1} (I - W)^{\varsigma}$). One way of doing so is to fix a large value $M$ and to compute

$$\hat{Z}^{(s)}_M = M^{-1} \sum_{\varsigma=0}^{s-1} \Big(I - \frac{W}{M}\Big)^{\varsigma} W Z, \tag{99}$$

and

$$\hat{b}^{(s)}_{m,M} = \mathrm{sign}\big(\hat{Z}^{(s)}_{m,M}\big). \tag{100}$$

This is a certain weighted SD-PIC scheme. This scheme will converge to $b$ as $s \to \infty$ whenever $\lambda_{\min} > 0$ and $\lambda_{\max} < M$. By taking $M$ such that $I_\infty(M) > \log 2$, and using Proposition 4.2, we obtain the following result:

Theorem 5.4 (Bit-error rate for optimal weighted SD-PIC) For all $k$ fixed, or for $k = k_n \to \infty$ such that $k_n = o(n/\log\log n)$, and $M$ such that $I_\infty(M) > \log 2$,

$$-\frac{1}{n}\log \mathbb{P}_k\big(\exists m = 1, \ldots, k \text{ for which } \hat{b}^{(\infty)}_{m,M} \ne b_m\big) \ge \log 2. \tag{101}$$
The above result can even be generalised to $k_n$ that grow arbitrarily fast with $n$, by taking $M$ dependent on $n$. For example, when we take $M > k_n$, then $\lambda_{\max} \le k_n < M$ is guaranteed.
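The effect of the weight $M$ can be seen in a small simulation of the partial sums in (99). The sketch below uses a hypothetical correlation matrix of our own choosing, corresponding to $k = 3$ strongly correlated codes of length $n = 8$, for which $\lambda_{\max} \approx 2.34 > 2$. With $M = 1$ (plain SD-PIC) the partial sums blow up, whereas with $M = 3$ they approach the sent vector.

```python
def mat_vec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def weighted_sd_pic(w, z, stages, weight):
    """M^{-1} * sum_{q=0}^{stages-1} (I - W/M)^q W z with M = weight (M = 1 is plain SD-PIC)."""
    term = mat_vec(w, z)  # (I - W/M)^q W z, starting with q = 0
    total = [0.0] * len(z)
    for _ in range(stages):
        total = [t + u for t, u in zip(total, term)]
        w_term = mat_vec(w, term)
        term = [u - wu / weight for u, wu in zip(term, w_term)]
    return [t / weight for t in total]

# Hypothetical example: W = C C^T / 8 for the all-ones code and two codes that each
# differ from it in a single, different position. Its eigenvalues are about 0.16, 0.5, 2.34.
w = [[1.0, 0.75, 0.75],
     [0.75, 1.0, 0.5],
     [0.75, 0.5, 1.0]]
bits = [1.0, -1.0, 1.0]

plain = weighted_sd_pic(w, bits, stages=60, weight=1.0)
weighted = weighted_sd_pic(w, bits, stages=400, weight=3.0)
print(plain, weighted)
```

The price of a large weight is slow convergence: the error decays like $(1 - \lambda_{\min}/M)^s$, which is why several hundred stages are used in this example.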

Further interesting problems arise when we allow the received signal to be noisy. In this case, 
the bit-error can be caused either by the properties of the eigenvalues, as in the case when there is 
no noise, or by the noise. When there is noise, weighted SD-PIC for large M enhances the noise, 
which makes the problem significantly harder. See [18] for further details. A solution to Conjecture 4.3 may prove to be useful in such an analysis.